PHP: Invalid Character Error

I get this error when running the code below:

    Fatal error: Uncaught exception 'DOMException' with message 'Invalid Character Error' in test.php:29
    Stack trace:
    #0 test.php(29): DOMDocument->createElement('1OhmStable', 'a')
    #1 {main}
      thrown in test.php on line 29

The node names taken from the source XML file do contain invalid characters, but since I strip the invalid characters out before using them, the nodes should be created. What kind of encoding do I need to apply to the original XML document? Do I need to decode the output of saveXML()?

    function __cleanData($c)
    {
        return preg_replace("/[^A-Za-z0-9]/", "", $c);
    }

    $xml = new DOMDocument('1.0', 'UTF-8');
    $xml->load('test.xml');
    $xml->formatOutput = true;

    $append = array();
    foreach ($xml->getElementsByTagName('product') as $product) {
        foreach ($product->getElementsByTagName('name') as $name) {
            $append[] = $name;
        }
        foreach ($append as $a) {
            $nodeName = __cleanData($a->textContent);
            $element = $xml->createElement(htmlentities($nodeName), 'a');
        }
        $product->removeChild($xml->getElementsByTagName('details')->item(0));
        $product->appendChild($element);
    }

    $result = $xml->saveXML();
    $file = "data.xml";
    file_put_contents($file, $result);

Here's what the original XML looks like:

    <?xml version="1.0" encoding="UTF-8"?>
    <?xml-stylesheet href="/v1/xsl/xml_pretty_printer.xsl" type="text/xsl"?>
    <products>
      <product>
        <modelNumber>M100</modelNumber>
        <itemId>1553725</itemId>
        <details>
          <detail>
            <name>1 Ohm Stable</name>
            <value>600 x 1</value>
          </detail>
        </details>
      </product>
    </products>

The new document should look like this:

    <?xml version="1.0" encoding="UTF-8"?>
    <?xml-stylesheet href="/v1/xsl/xml_pretty_printer.xsl" type="text/xsl"?>
    <products>
      <product>
        <modelNumber>M100</modelNumber>
        <itemId>1553725</itemId>
        <1 Ohm Stable> </1 Ohm Stable>
      </product>
    </products>
+6
4 answers

An element name simply cannot start with a digit:

    1OhmStable   <-- rename this
    _1OhmStable  <-- this is fine

See also the related question: php parse xml - error: StartTag: invalid element name

Good article: http://www.xml.com/pub/a/2001/07/25/namingparts.html

A Name is a token beginning with a letter or one of a few punctuation characters, and continuing with letters, digits, hyphens, underscores, colons, or full stops, together known as name characters.
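To make the rule concrete, here is a minimal sketch of a cleaning function that always yields a usable element name. The helper name toXmlName() is hypothetical (it is not in the question), and it assumes plain ASCII names with an underscore prefix are acceptable for your output:

```php
<?php
// Hypothetical helper (not from the original post): reduce arbitrary text
// to an ASCII-only string that is a valid XML element name.
function toXmlName(string $text): string
{
    // Keep only ASCII letters, digits, and underscores (this also drops spaces).
    $name = preg_replace('/[^A-Za-z0-9_]/', '', $text);

    // A name may not be empty and may not start with a digit,
    // so prefix an underscore in those cases.
    if ($name === '' || preg_match('/^[0-9]/', $name)) {
        $name = '_' . $name;
    }
    return $name;
}

$doc = new DOMDocument('1.0', 'UTF-8');
$element = $doc->createElement(toXmlName('1 Ohm Stable'), 'a');
$doc->appendChild($element);

// Serialize only the new element, without the XML declaration.
echo $doc->saveXML($element), "\n"; // <_1OhmStable>a</_1OhmStable>
```

The underscore prefix is only one possible convention; any legal start character (a letter, for example) would work just as well.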

+10

You did not say where exactly you get this error. If it happens after the value has been cleaned, here is my guess:

 preg_replace("/[^A-Za-z0-9]/", "",$c); 

This replacement is not written for UTF-8 encoded strings (which is what DOMDocument uses). You can make it UTF-8 aware with the u modifier (PCRE_UTF8):

    preg_replace("/[^A-Za-z0-9]/u", "", $c);
                                ^

This is only a guess; I suggest you clarify in your question which part of the code causes the error.

+5

Even though __cleanData() removes every character except the Latin letters a-z (in both cases) and digits, that does not guarantee the result is a valid XML name. Your function can return strings that start with a digit, but digits are not legal name start characters in XML; they can only appear after the first character of a name. Spaces are also forbidden in names, so this is another point where your expected XML output will fail.
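One way to see which candidates pass is to let DOM itself judge the name. The following is a rough sketch; isValidElementName() is a hypothetical helper, and it relies on DOMDocument's default behavior of throwing a DOMException for invalid names (the same exception shown in the question):

```php
<?php
// Hypothetical helper: ask DOMDocument whether a string is usable
// as an element name by trying to create such an element.
function isValidElementName(string $name): bool
{
    $doc = new DOMDocument('1.0', 'UTF-8');
    try {
        $doc->createElement($name);
        return true;
    } catch (DOMException $e) {
        // "Invalid Character Error" is raised for names such as "1OhmStable".
        return false;
    }
}

$candidates = ['1OhmStable', '1 Ohm Stable', '_1OhmStable', 'OhmStable1'];
foreach ($candidates as $candidate) {
    $verdict = isValidElementName($candidate) ? 'valid' : 'invalid';
    printf("%-14s %s\n", $candidate, $verdict);
}
```

Only the last two candidates should be reported as valid: the first starts with a digit, and the second also contains spaces.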

+1

Make sure the scripts have the same encoding: if it is UTF-8, make sure they do not have a byte order mark (BOM) at the very beginning of the file. To do this, open your XML file in a text editor such as Notepad++ and convert it to "UTF-8 without BOM".
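If you would rather handle this in code than in an editor, a sketch like the following strips a leading UTF-8 BOM before parsing. stripUtf8Bom() is a hypothetical helper, and whether a BOM is actually involved in the error above is only an assumption:

```php
<?php
// Hypothetical helper: drop a leading UTF-8 byte order mark (bytes EF BB BF),
// if present, and return the rest of the data unchanged.
function stripUtf8Bom(string $data): string
{
    $bom = "\xEF\xBB\xBF";
    if (strncmp($data, $bom, 3) === 0) {
        return substr($data, 3);
    }
    return $data;
}

// Sample input with a BOM in front of the XML declaration.
$raw = "\xEF\xBB\xBF" . '<?xml version="1.0" encoding="UTF-8"?><products/>';

$doc = new DOMDocument();
$doc->loadXML(stripUtf8Bom($raw));
echo $doc->documentElement->nodeName, "\n"; // products
```

For a file on disk, the same helper could be applied to the result of file_get_contents('test.xml') before passing it to loadXML().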

I had a similar error, but with a JSON file.

0

Source: https://habr.com/ru/post/903861/

