Perl XML::Twig - preserving quotes in and around attributes

I selectively fix some elements and attributes. Unfortunately, our input files use both single and double quotes around attribute values. In addition, some attribute values contain quotation marks inside the value.

Using XML::Twig, I don't see how to preserve whichever quotes were used around each attribute value.

Here is some example code:

    use strict;
    use XML::Twig;

    my $file = qq(<file>
      <label1 attr='This "works"!' />
      <label2 attr="This 'works'!" />
    </file>
    );

    my $fixes = 0;    # count fixes
    my $twig = XML::Twig->new(
        twig_handlers => {
            '[@attr]' => sub { fix_att(@_, \$fixes); },
        },
        # ...
        keep_atts_order => 1,
        keep_spaces     => 1,
        keep_encoding   => 1,
    );
    #$twig->set_quote('single');

    $twig->parse($file);
    print $twig->sprint();

    sub fix_att {
        my ($t, $elt, $fixes) = @_;
        # ...
    }

The above code prints invalid XML for label1:

 <label1 attr="This "works"!" /> 

If I add:

 $twig->set_quote('single'); 

then label2 becomes invalid XML instead:

 <label2 attr='This 'works'!' /> 

Is it possible to preserve the existing quotes? Or is there a better approach for selectively fixing twigs?

1 answer

Is there any specific reason for using keep_encoding? Without it, the quotes are escaped correctly.

keep_encoding is used to preserve the original encoding of the file, but there are other ways to do that. It was mostly useful before Perl 5.8, when encodings did not work as smoothly as they do now.
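As a minimal sketch of that suggestion, the snippet below parses the question's sample document without keep_encoding and then re-parses the serialized output to confirm it is still well-formed and that the attribute values survive. The variable names ($xml, $out, $check) are illustrative and not from the original post. keep_atts_order is left out here only because it requires Tie::IxHash. Note that this escapes the embedded quotes rather than keeping the original quote characters around each attribute.

```perl
use strict;
use warnings;
use XML::Twig;

# Sample input from the question: mixed quote styles, with quotes inside values.
my $xml = qq(<file>
  <label1 attr='This "works"!' />
  <label2 attr="This 'works'!" />
</file>
);

# Same idea as the question's setup, but WITHOUT keep_encoding, so XML::Twig
# escapes special characters in attribute values when serializing.
my $twig = XML::Twig->new( keep_spaces => 1 );
$twig->parse($xml);

my $out = $twig->sprint;
print $out;

# Round-trip check: the output should parse again (parse dies on invalid XML),
# and each attribute should come back with its original value.
my $check = XML::Twig->new;
$check->parse($out);
for my $name (qw(label1 label2)) {
    my $elt = $check->root->first_child($name);
    printf "%s: %s\n", $name, $elt->att('attr');
}
```

If the output also has to be written in a particular encoding, XML::Twig's output_encoding option is probably one of the "other ways" worth checking in the module documentation.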


Source: https://habr.com/ru/post/1484704/

