XML::Simple just ignoring the emdash tag?

I am using XML::Simple to parse an XML file. The problematic part is as follows:

    <textBody>
        <title>
            <titlePart>
                <text>SECTION A <emdash/> HUMAN NECESSITIES</text>
            </titlePart>
        </title>
    </textBody>
    <ipcEntry kind="t" symbol="A01" ipcLevel="C" entryType="K" lang="EN">
        <textBody>
            <title>
                <titlePart>
                    <text>AGRICULTURE</text>
                </titlePart>
            </title>
        </textBody>
    </ipcEntry>

For some reason XML::Simple completely ignores <text>SECTION A <emdash/> HUMAN NECESSITIES</text>. I think this is because of the emdash tag, because <text>AGRICULTURE</text> parses just fine. I also tried setting the parser:

    $XML::Simple::PREFERRED_PARSER = 'XML::Parser';

Still no luck. Any ideas?

+3
2 answers

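What you have there is, in XML terms, "mixed content". XML::Simple does not handle it well (or really at all). The XML::Simple documentation addresses this directly:

Mixed content (elements that contain both text content and nested elements) is not represented in a useful way, because element order and significant whitespace are lost. If you need to work with mixed content, XML::Simple is not the right tool for the job.

So XML::Simple is a poor fit for this kind of XML. XML::LibXML or XML::Twig would be a better choice.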
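As a rough illustration of the XML::LibXML route, the sketch below walks the child nodes of the text element in document order and substitutes an em dash character for each emdash element. The helper name flatten_text is hypothetical, and mapping the empty emdash element to U+2014 is an assumption about what the tag is meant to represent.

```perl
use strict;
use warnings;
use XML::LibXML;

my $xml = <<'XML';
<textBody>
    <title>
        <titlePart>
            <text>SECTION A <emdash/> HUMAN NECESSITIES</text>
        </titlePart>
    </title>
</textBody>
XML

my $doc = XML::LibXML->load_xml(string => $xml);

# Hypothetical helper: rebuild the text of a mixed-content element,
# visiting its child nodes in document order.
sub flatten_text {
    my ($element) = @_;
    my $out = '';
    for my $node ($element->childNodes) {
        if ($node->nodeType == XML_TEXT_NODE) {
            $out .= $node->nodeValue;
        }
        elsif ($node->nodeType == XML_ELEMENT_NODE
            && $node->nodeName eq 'emdash') {
            # Assumption: <emdash/> stands for the em dash character U+2014.
            $out .= "\x{2014}";
        }
    }
    return $out;
}

# '//text' selects elements named "text" (not the text() node test).
my ($text)  = $doc->findnodes('//text');
my $title   = flatten_text($text);

binmode STDOUT, ':encoding(UTF-8)';
print "$title\n";
```

Because a DOM keeps text nodes and child elements in order, the position of the emdash element is preserved, which is exactly what the XML::Simple hash structure cannot do.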

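Alternatively, if you control how the XML is produced, you could emit a character reference instead of the tag. XML::Simple would handle this:

    <text>SECTION A &#8212; HUMAN NECESSITIES</text>

just fine. (&#8212; is the numeric character reference for the em dash.)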
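If the XML itself cannot be changed at the source, one possible workaround is to rewrite the tag into the character reference just before parsing. This is only a sketch: it assumes the tag always appears as a self-closing <emdash/> (optionally with whitespace before the slash), and a regex substitution on raw XML is a shortcut rather than a robust transformation.

```perl
use strict;
use warnings;
use XML::Simple;

my $xml = q{
    <textBody>
        <title>
            <titlePart>
                <text>SECTION A <emdash/> HUMAN NECESSITIES</text>
            </titlePart>
        </title>
    </textBody>
};

# Assumption: every em dash is written as a self-closing <emdash/> tag.
(my $patched = $xml) =~ s{<emdash\s*/>}{&#8212;}g;

# With the tag gone, <text> holds plain text only, so XML::Simple
# returns it as an ordinary string.
my $data  = XMLin($patched);
my $title = $data->{title}{titlePart}{text};

binmode STDOUT, ':encoding(UTF-8)';
print "$title\n";
```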

+5

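XML::Simple is not actually ignoring the tag, but, as its documentation warns, it does not deal with this kind of content well:

Mixed content (elements that contain both text content and nested elements) is not represented in a useful way, because element order and significant whitespace are lost. If you need to work with mixed content, XML::Simple is not the right tool for the job.

For example, if you run this: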

use Data::Dumper;
use XML::Simple;
print Dumper(XMLin(qq{
    <textBody>
        <title>
            <titlePart>
                <text>SECTION A <emdash/> HUMAN NECESSITIES</text>
            </titlePart>
        </title>
    </textBody>
}));

you get:

$VAR1 = {
    'title' => { 
        'titlePart' => { 
            'text' => { 
                'emdash' => {}, 
                'content' => [ 
                    'SECTION A ', 
                    ' HUMAN NECESSITIES'
                ]
            }
        }   
    }
};

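So the emdash element is parsed, but the text around it is split into separate pieces and the position of the element relative to them is lost.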
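Given the structure shown in that dump, the pieces can be stitched back together by hand. The sketch below assumes that each gap between consecutive content entries corresponds to exactly one emdash element, which holds for this input but is not something XML::Simple guarantees in general.

```perl
use strict;
use warnings;
use XML::Simple;

my $data = XMLin(q{
    <textBody>
        <title>
            <titlePart>
                <text>SECTION A <emdash/> HUMAN NECESSITIES</text>
            </titlePart>
        </title>
    </textBody>
});

my $text = $data->{title}{titlePart}{text};

my $title;
if (ref $text eq 'HASH') {
    # Mixed content: 'content' holds the text fragments.
    my $content = $text->{content};
    my @parts   = ref $content eq 'ARRAY' ? @$content : ($content);
    # Assumption: one <emdash/> sat between each pair of fragments.
    $title = join "\x{2014}", @parts;
}
else {
    # Plain text element, e.g. <text>AGRICULTURE</text>.
    $title = $text;
}

binmode STDOUT, ':encoding(UTF-8)';
print "$title\n";
```

This only papers over the problem for simple cases; a text element with several different inline tags could not be reconstructed this way.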

+4

Source: https://habr.com/ru/post/1776053/