Remove &amp; from string when writing to XML in PHP

I am trying to use DOMDocument to write an XML file that includes a link containing an & character. When I do this, the & in the link becomes &amp; in the XML. So product=1&qty=1 becomes product=1&amp;qty=1.

Could you tell me how to avoid this?

2 answers

Ampersands are supposed to be encoded like that. Changing it would be wrong.

See http://www.w3.org/TR/xml/

The ampersand character (&) and the left angle bracket (<) MUST NOT appear in their literal form, except when used as markup delimiters, or within a comment, a processing instruction, or a CDATA section. If they are needed elsewhere, they MUST be escaped using either numeric character references or the strings "&amp;" and "&lt;" respectively.

and http://www.w3.org/TR/xhtml1/#C_12

In both SGML and XML, the ampersand character ("&") declares the beginning of an entity reference (e.g., &reg; for the registered trademark symbol "®"). Unfortunately, many HTML user agents have silently ignored incorrect usage of the ampersand character in HTML documents - treating ampersands that do not look like entity references as literal ampersands. XML-based user agents will not tolerate this incorrect usage, and any document that uses an ampersand incorrectly will not be "valid", and consequently will not conform to this specification. In order to ensure that documents are compatible with historical HTML user agents and XML-based user agents, ampersands used in a document that are to be treated as literal characters must be expressed themselves as an entity reference (e.g. "&amp;"). For example, when the href attribute of the a element refers to a CGI script that takes parameters, it must be expressed as http://my.site.dom/cgi-bin/myscript.pl?class=guest&amp;name=user rather than as http://my.site.dom/cgi-bin/myscript.pl?class=guest&name=user
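
To illustrate the point of the quoted rules, here is a minimal sketch, assuming PHP's DOM extension is available. The element name link and the example.com URL are made up for the example and are not from the question. It suggests that the &amp; only exists in the serialized file, and that an XML parser hands back the literal & when the value is read:

```php
<?php
// Hypothetical URL and element name, for illustration only.
$url = 'http://example.com/cart?product=1&qty=1';

$doc = new DOMDocument('1.0', 'UTF-8');
$link = $doc->createElement('link');
// A text node is escaped automatically when the document is serialized.
$link->appendChild($doc->createTextNode($url));
$doc->appendChild($link);

// The serialized markup contains the escaped form product=1&amp;qty=1.
$xml = $doc->saveXML();

// Parsing the markup again returns the value with a literal ampersand.
$parsed = new DOMDocument();
$parsed->loadXML($xml);
$roundTripped = $parsed->documentElement->textContent;
```

Under these assumptions, any consumer that reads the file with a real XML parser sees the original URL, so nothing needs to be "removed" from the file itself.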


As Gordon said, URIs are encoded this way. If you did not encode & as &amp;, the XML file would be broken and you would get errors when parsing it. When you read the string back from the XML file, if &amp; still shows up, either use str_replace() like this:

$str = str_replace('&amp;', '&', $str);

Or use htmlspecialchars_decode():

$str = htmlspecialchars_decode($str);

The added bonus of using htmlspecialchars_decode() is that it will also decode any other HTML special-character entities (such as &lt; and &gt;) that may be in the string. See the PHP manual entry for htmlspecialchars_decode() for more details.
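
As a rough comparison of the two approaches, the sketch below decodes the same string both ways. The input string is a hypothetical value, assumed to have been read as raw text rather than through an XML parser:

```php
<?php
// Hypothetical raw value that still contains escaped characters.
$raw = 'product=1&amp;qty=1&amp;tag=&lt;b&gt;';

// str_replace() only reverses the ampersand escape.
$onlyAmp = str_replace('&amp;', '&', $raw);

// htmlspecialchars_decode() also reverses &lt; and &gt;.
$decoded = htmlspecialchars_decode($raw);
```

Here $onlyAmp still contains &lt;b&gt;, while $decoded contains <b>. Which one is preferable depends on whether the other entities should stay escaped.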


Source: https://habr.com/ru/post/890715/

