I am using the XML::Twig module to remove all comments from an XML file. A sample file looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<Node_A>
node A content 1
<![CDATA[this portion within the two comments is being REMOVED which is not the intention]]>
node A content 3
<![CDATA[this portion within the two comments is being REMOVED which is not the intention]]>
<![CDATA[ this portion is fine]]>
<Node_B>
node B content
<Node_C> node c content </Node_C>
some data one some data again two few more
</Node_B>
</Node_A>
This is the script I used:
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;

my $infile = 'demo.xml';

# Parse the file, dropping comments and indenting the output
my $twig = XML::Twig->new(comments => 'drop', pretty_print => 'indented')->parsefile($infile);
$twig->print();
This script also removes the CDATA sections that sit between the two comments, which is not what I want. The output is:
<?xml version="1.0" encoding="UTF-8"?>
<Node_A> node A content 1 <![CDATA[ this portion is fine]]><Node_B> node B content <Node_C> node c content </Node_C> some data one some data again two few more </Node_B></Node_A>
What should I add so that the CDATA sections and everything else are kept exactly as they are, and only the comments are deleted?
Thanks in advance.