XMLReader Encoding Error

I have a PHP script that parses a huge XML file using the XMLReader library. During parsing, I get this encoding error:

Input is not proper UTF-8, indicate encoding ! Bytes: 0xA0 0x32 0x36 0x30

I would like to know if there is a way to skip records with bad characters.

Thanks!

+1
4 answers

First of all, make sure that your XML file really is encoded in UTF-8. If it is not, specify the encoding as the second parameter to XMLReader::open().
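As a rough illustration of passing the encoding explicitly, the sketch below writes a small sample file and reads it back. The temporary file, its contents, and the choice of ISO-8859-1 are assumptions made for the example, not details from the question.

```php
<?php
// Hypothetical sample: a tiny XML file with a raw 0xA0 byte
// and no encoding declaration.
$path = tempnam(sys_get_temp_dir(), 'xml');
file_put_contents($path, "<items><item>\xA0260</item></items>");

$reader = new XMLReader();
// The second argument tells libxml how to decode the file's bytes.
$reader->open($path, 'ISO-8859-1');

$values = [];
while ($reader->read()) {
    if ($reader->nodeType === XMLReader::TEXT) {
        $values[] = $reader->value; // XMLReader hands values back as UTF-8
    }
}
$reader->close();
unlink($path);
```

This only helps if the whole file really is in the encoding you pass; a file that mixes encodings will still fail or produce wrong characters.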

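If the encoding error is caused by a genuinely malformed byte sequence in a UTF-8 file and you are using PHP > 5.2.0, you can pass LIBXML_NOERROR and/or LIBXML_NOWARNING as a bitmask in the third parameter of XMLReader::open():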

$xml = new XMLReader(); 
$xml->open('myxml.xml', null, LIBXML_NOERROR | LIBXML_NOWARNING); 

If you are using PHP > 5.1.0, you can adjust libXML's error handling:

// enable user error handling
libxml_use_internal_errors(true);
/* ... do your XML processing ... */
$errors = libxml_get_errors();
foreach ($errors as $error) {
    // handle errors here
}
libxml_clear_errors();
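To show where XMLReader fits into that pattern, here is a minimal sketch that reads a deliberately broken sample file and collects the libXML errors afterwards. The sample file and the `$messages` array are assumptions for illustration only.

```php
<?php
// Hypothetical sample file: the raw 0xA0 byte makes it invalid UTF-8.
$path = tempnam(sys_get_temp_dir(), 'xml');
file_put_contents($path, "<items><item>\xA0260</item></items>");

libxml_use_internal_errors(true); // collect libXML errors instead of emitting them
libxml_clear_errors();

$reader = new XMLReader();
$reader->open($path);
while ($reader->read()) {
    // process nodes here; read() returns false once parsing stops
}

// Turn the collected LibXMLError objects into readable messages.
$messages = [];
foreach (libxml_get_errors() as $error) {
    $messages[] = sprintf('line %d: %s', $error->line, trim($error->message));
}
libxml_clear_errors();
$reader->close();
unlink($path);
```

Collecting the errors this way tells you where the file is broken, but it does not by itself let the reader continue past the bad bytes.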

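Note that I don't actually know what happens to the node in which the error occurs, or whether XMLReader is left in an error state afterwards.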


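Edit:

libXML defines the constant XML_PARSE_RECOVER (1), which is not exposed by ext/libxml in PHP. You can, however, pass the integer value 1 in the $options parameter.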

$xml = new XMLReader(); 
$xml->open('myxml.xml', null, LIBXML_NOERROR | LIBXML_NOWARNING | 1); 
+8

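I suspect the problem is not in XMLReader but in the file's encoding. ASCII, UTF-8 and ISO-8859-1 all encode the characters below 128 the same way, so a file that is really ISO-8859-1 looks fine as long as it contains only ASCII characters. It breaks at the first byte above 127, because an XML parser assumes UTF-8 when no encoding is declared.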

In ISO-8859-1, the bytes 0xA0 0x32 0x36 0x30 are: a non-breaking space, "2", "6", "0".
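The byte sequence from the error message can be checked directly. The sketch below tests whether it is valid UTF-8 and then converts it on the assumption that it is ISO-8859-1. The variable names are illustrative, and the iconv extension is assumed to be available.

```php
<?php
// The four bytes reported in the error message.
$bytes = "\xA0\x32\x36\x30";

// preg_match() with the /u modifier fails on invalid UTF-8,
// so this is a cheap validity check.
$isValidUtf8 = preg_match('//u', $bytes) === 1;

// Assumption: the bytes are ISO-8859-1. Convert them to UTF-8.
$asUtf8 = iconv('ISO-8859-1', 'UTF-8', $bytes);

// Hex dump of the converted string for inspection.
$hex = bin2hex($asUtf8);
```

Here 0xA0 on its own is a UTF-8 continuation byte with no lead byte, which is why the parser rejects it; after conversion it becomes the two-byte UTF-8 sequence for a non-breaking space.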

+2

XML , " " , ( , ) .

XML , .

0
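$xml = file_get_contents('myxml.xml');
// Strip control characters (U+0000-U+001F, U+007F-U+009F). Caveats: this also removes
// tabs and newlines, and with /u preg_replace() returns null if $xml is not valid UTF-8.
$xml = preg_replace('/[\x00-\x1f\x7f-\x9f]/u', '', $xml);
// parse $xml below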

0

Source: https://habr.com/ru/post/1761645/

