XPath variables in XML::Twig or other modules

I am using XML::Twig::XPath to work with ITS data, and I am trying to figure out how to resolve XPath expressions that contain variables. Here is an example of what I need to handle, taken from the ITS spec:

    <its:rules version="2.0">
      <its:param name="LCID">0x0409</its:param>
      <its:translateRule selector="//msg[@lcid=$LCID]" translate="yes"/>
    </its:rules>

I need to evaluate the XPath expression contained in the selector attribute, with the variable's value being the content of the its:param element. I don't understand how to do this. The XML::XPath documentation mentions variables (which I assume should be part of the context), and it even has a class to represent them, but it does not say how to set variables in a context. I would not even know how to reach such functionality from XML::Twig, if that is possible at all.

Does anyone know how to do this? Alternatively, can you give an example of using this kind of functionality with another module, such as XML::LibXML (whose documentation mentions variables often, but leaves me unsure how to do this with variables that are strings)?

+4
4 answers

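libxml2, and therefore XML::LibXML, supports variable references in XPath expressions. (Variable references are already part of XPath 1.0, which is the version libxml2 implements.) You register a lookup function on an XML::LibXML::XPathContext object: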

    use strict;
    use warnings;
    use feature qw( say );

    use XML::LibXML               qw( );
    use XML::LibXML::XPathContext qw( );

    # Called for each $variable with ($data, $name, $namespace_uri).
    sub dict_lookup {
        my ($dict, $var_name, $ns) = @_;
        $var_name = "{$ns}$var_name" if defined($ns) && length($ns);
        my $val = $dict->{$var_name};
        if (!defined($val)) {
            warn("Unknown variable \"$var_name\"\n");
            $val = '';
        }
        return $val;
    }

    my $xml = <<'__EOI__';
    <r>
    <e x="a">A</e>
    <e x="b">B</e>
    </r>
    __EOI__

    my %dict = ( x => 'b' );

    my $parser = XML::LibXML->new();
    my $doc    = $parser->parse_string($xml);

    my $xpc = XML::LibXML::XPathContext->new();
    $xpc->registerVarLookupFunc(\&dict_lookup, \%dict);

    say $_->textContent() for $xpc->findnodes('//e[@x=$x]', $doc);
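Applied to the ITS case from the question, the same approach might look like the sketch below. The sample document, the `%params` hash and the choice to treat unknown variables as empty strings are assumptions made for illustration; only `registerNs`, `registerVarLookupFunc`, `findnodes`, `getAttribute` and `textContent` come from XML::LibXML itself.

```perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXML::XPathContext;

# Hypothetical sample document built around the question's example.
my $doc = XML::LibXML->load_xml(string => <<'__EOI__');
<doc xmlns:its="http://www.w3.org/2005/11/its">
  <its:rules version="2.0">
    <its:param name="LCID">0x0409</its:param>
    <its:translateRule selector="//msg[@lcid=$LCID]" translate="yes"/>
  </its:rules>
  <msg lcid="0x0409">Hello</msg>
  <msg lcid="0x040C">Bonjour</msg>
</doc>
__EOI__

my $xpc = XML::LibXML::XPathContext->new($doc);
$xpc->registerNs(its => 'http://www.w3.org/2005/11/its');

# Collect every its:param as name => text content.
my %params;
for my $p ($xpc->findnodes('//its:param')) {
    $params{ $p->getAttribute('name') } = $p->textContent;
}

# Resolve $name references from %params; unknown names become ''
# (an assumption of this sketch, not ITS-mandated behaviour).
$xpc->registerVarLookupFunc(
    sub {
        my ($dict, $name, $uri) = @_;
        my $val = $dict->{$name};
        return defined($val) ? $val : '';
    },
    \%params,
);

# Evaluate each selector with the variables bound.
my @translatable;
for my $rule ($xpc->findnodes('//its:translateRule[@translate="yes"]')) {
    push @translatable, $xpc->findnodes($rule->getAttribute('selector'));
}

print $_->textContent, "\n" for @translatable;
```

With this sample input, only the msg element whose lcid equals the parameter value should be selected. No string rewriting of the selector is needed, because the XPath engine resolves `$LCID` itself.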
+3

If your XPath engine gives you no way to bind variables, you can instead treat the selector value as a template whose grammar is:

    start : parts EOI
    parts : part parts
          |
    part  : string_literal
          | variable
          | other

The following code builds the XPath expression from the XPath template.

    # Turn an arbitrary string into an XPath string literal.
    sub text_to_xpath_lit {
        my ($s) = @_;
        return qq{"$s"} if $s !~ /"/;
        return qq{'$s'} if $s !~ /'/;
        # Contains both kinds of quote: split around each '"' and concat().
        $s =~ s/"/", '"', "/g;
        return qq{concat("$s")};
    }

    my $NCNameStartChar_class = '_A-Za-z\xC0-\xD6\xD8-\xF6\xF8-\x{2FF}\x{370}-\x{37D}\x{37F}-\x{1FFF}\x{200C}-\x{200D}\x{2070}-\x{218F}\x{2C00}-\x{2FEF}\x{3001}-\x{D7FF}\x{F900}-\x{FDCF}\x{FDF0}-\x{FFFD}\x{10000}-\x{EFFFF}';
    my $NCNameChar_class = $NCNameStartChar_class . '\-.0-9\xB7\x{300}-\x{36F}\x{203F}-\x{2040}';
    my $NCName_pat = "[$NCNameStartChar_class][$NCNameChar_class]*+";

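    # Assumes $xpath_template, %ns_map and var_lookup() are defined elsewhere.
    my $xpath = '';
    for ($xpath_template) {
        while (1) {
            if (/\G ( [^'"\$]++ ) /xgc) {
                # Anything that is neither a quote nor a '$' is copied as is.
                $xpath .= $1;
            }
            elsif (/\G (?=['"]) /xgc) {
                # XPath 1.0 string literals have no escape sequences,
                # so a backslash is an ordinary character inside them.
                /\G ( ' [^']*+ ' | " [^"]*+ " ) /sxgc
                    or die("Unmatched quote\n");
                $xpath .= $1;
            }
            elsif (/\G \$ /xgc) {
                /\G (?: ( $NCName_pat ) : )?+ ( $NCName_pat ) /xgc
                    or die("Unexpected '\$'\n");
                my ($prefix, $var_name) = ($1, $2);
                # Only look up a namespace when the variable has a prefix.
                my $ns;
                if (defined($prefix)) {
                    $ns = $ns_map{$prefix}
                        or die("Undefined prefix '$prefix'\n");
                }
                $xpath .= text_to_xpath_lit(var_lookup($ns, $var_name));
            }
            elsif (/\G \z /xgc) {
                last;
            }
        }
    }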

var_lookup example:

    sub var_lookup {
        my ($ns, $var_name) = @_;
        $var_name = "{$ns}$var_name" if defined($ns);
        my $val = $params{$var_name};
        if (!defined($val)) {
            warn("Unknown variable \"$var_name\"\n");
            $val = '';
        }
        return $val;
    }

Untested.
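As a quick check of the quoting helper from this answer, the sketch below runs it on three kinds of value: one with no quotes, one with double quotes only, and one with both kinds. The helper is repeated so the snippet runs on its own; the sample strings are made up for illustration.

```perl
use strict;
use warnings;

# Same helper as in the answer, repeated so this snippet is self-contained.
sub text_to_xpath_lit {
    my ($s) = @_;
    return qq{"$s"} if $s !~ /"/;
    return qq{'$s'} if $s !~ /'/;
    $s =~ s/"/", '"', "/g;
    return qq{concat("$s")};
}

# Hypothetical parameter values.
my @samples = ('0x0409', 'say "hi"', q{it's "quoted"});

# Prints:
#   "0x0409"
#   'say "hi"'
#   concat("it's ", '"', "quoted", '"', "")
print text_to_xpath_lit($_), "\n" for @samples;
```

The concat() form is needed because XPath 1.0 string literals have no escape mechanism, so a value containing both quote characters cannot be written as a single literal.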

+2

Here is the complete solution.

I worked around the "what is a QName" part by building a regular expression from the parameter names already found. It can be slow if there are many parameters, but it works fine with the W3C example. Building the regexp means escaping each name between \Q and \E, so that metacharacters in the names are treated literally; sorting the names by decreasing length, so that a shorter name does not match in place of a longer one; and then joining them with the '|' character.
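The regexp construction described above can be sketched in isolation as follows. The parameter names and values are hypothetical, chosen so that one name is a prefix of another and one contains a regexp metacharacter; the quoting of values is simplified to plain double quotes.

```perl
use strict;
use warnings;

# Hypothetical parameters: "LCID" is a prefix of "LCID2", "a.b" contains a dot.
my %param = (LCID => '0x0409', LCID2 => '0x040C', 'a.b' => 'x');

# Escape each name, put the longest names first, join with '|'.
my $match_param = join '|',
    map  { quotemeta($_) }
    sort { length($b) <=> length($a) }
    keys %param;

my $selector = '//msg[@lcid=$LCID2 or @lcid=$LCID]';

# Replace every $name by its value as a double-quoted XPath literal.
(my $expanded = $selector) =~ s{\$($match_param)}{ '"' . $param{$1} . '"' }ge;

print "$match_param\n";   # LCID2|LCID|a\.b
print "$expanded\n";      # //msg[@lcid="0x040C" or @lcid="0x0409"]
```

Without the sort, `LCID` could come first in the alternation and match the start of `$LCID2`, leaving a stray `2` in the output.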

Limitations:

  • no error is raised when a selector uses a parameter that has not been defined earlier,
  • namespaces in selectors are not handled; this is easy to add if you have real data: just add the corresponding map_xmlns declarations,
  • the entire document is loaded into memory, which is hard to avoid if you want to support arbitrary XPath selectors.

Here it is:

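    #!/usr/bin/perl

    use strict;
    use warnings;

    use XML::Twig::XPath;

    my %param;          # its:param name => value
    my $match_param;    # alternation of all parameter names seen so far
    my @selectors;      # selectors with the parameters already substituted

    my $t = XML::Twig::XPath->new(
        map_xmlns     => { 'http://www.w3.org/2005/11/its' => 'its' },
        twig_handlers => {
            'its:param' => sub {
                $param{ $_->att('name') } = $_->text;
                # Longest names first, so a short name cannot shadow a longer one.
                $match_param = join '|',
                    map  { "\Q$_\E" }
                    sort { length($b) <=> length($a) }
                    keys %param;
            },
            'its:translateRule[@translate="yes"]' => sub {
                my $selector = $_->att('selector');
                $selector =~ s{\$($match_param)}{quote($param{$1})}eg;
                push @selectors, $selector;
            },
        },
    )->parse(\*DATA);

    foreach my $selector (@selectors) {
        my @matches = $t->findnodes($selector);
        print "$selector: ";
        foreach my $match (@matches) {
            $match->print;
            print "\n";
        }
    }

    # Wrap a value in whichever quote character it does not contain.
    sub quote {
        my ($param) = @_;
        return $param =~ m{"} ? qq{'$param'} : qq{"$param"};
    }

    # Illustrative input, assembled from the example in the question.
    __DATA__
    <doc xmlns:its="http://www.w3.org/2005/11/its">
      <its:rules version="2.0">
        <its:param name="LCID">0x0409</its:param>
        <its:translateRule selector="//msg[@lcid=$LCID]" translate="yes"/>
      </its:rules>
      <msg lcid="0x0409">Hello</msg>
      <msg lcid="0x040C">Bonjour</msg>
    </doc>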
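The first limitation listed above (no error for an undefined parameter) could be lifted by matching anything that looks like a variable reference and checking it against the known parameters. The following is only a sketch: `expand_selector` is a hypothetical helper, the name pattern is an ASCII-only simplification of a real QName, and a `$` inside an XPath string literal would be wrongly treated as a variable.

```perl
use strict;
use warnings;

# Hypothetical helper: substitute $name references, dying on unknown names.
sub expand_selector {
    my ($selector, $param) = @_;
    $selector =~ s{\$([A-Za-z_][A-Za-z0-9_.-]*)}{
        exists $param->{$1}
            or die "Undefined ITS parameter '\$$1' in selector\n";
        my $v = $param->{$1};
        # Same quoting rule as quote() in the script above.
        $v =~ /"/ ? qq{'$v'} : qq{"$v"};
    }ge;
    return $selector;
}

print expand_selector('//msg[@lcid=$LCID]', { LCID => '0x0409' }), "\n";

# An unknown parameter is now reported instead of being left in place.
my $ok = eval { expand_selector('//msg[@lcid=$nope]', {}); 1 };
print "error: $@" unless $ok;
```

The trade-off is that the name pattern has to be written by hand again, which is exactly what the regexp built from known names was meant to avoid.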
+2

In XML::XPath, you can set variables on the XML::XPath::Parser object. It does not seem to be directly accessible through the XML::XPath object; to get to it you have to use $xp->{path_parser}, which is undocumented. Here is an example with a string variable, as well as with a node-set variable:

    use XML::XPath;
    use XML::XPath::Parser;
    use XML::XPath::Literal;

    my $xp = XML::XPath->new(xml => <<'ENDXML');
    <?xml version="1.0"?>
    <xml>
        <a>
            <stuff foo="bar">
                junk
            </stuff>
        </a>
    </xml>
    ENDXML

    # set the variable to the literal string 'bar'
    $xp->{path_parser}->set_var('foo_att', XML::XPath::Literal->new('bar'));
    my $nodeset = $xp->find('//*[@foo=$foo_att]');

    foreach my $node ($nodeset->get_nodelist) {
        print "1. FOUND\n\n", XML::XPath::XMLParser::as_string($node), "\n\n";
    }

    # set the variable to the nodeset found from the previous query
    $xp->{path_parser}->set_var('stuff_el', $nodeset);
    $nodeset = $xp->find('/*[$stuff_el]');

    foreach my $node ($nodeset->get_nodelist) {
        print "2. FOUND\n\n", XML::XPath::XMLParser::as_string($node), "\n\n";
    }
0

Source: https://habr.com/ru/post/1487982/

