PHP RSS feed validation problem with special characters

I keep getting the validation warning below. Some of my articles contain special characters, and I was wondering how I can escape or strip special characters in my RSS feeds. Should I use htmlentities or not? If so, how?

In addition, interoperability with the widest range of feed readers could be improved by implementing the following recommendations. line 22, column 35: title should not contain HTML: &

PHP code:

 <title>' . htmlentities(strip_tags($title), ENT_QUOTES, "UTF-8") . '</title> 
3 answers

You should use CDATA to escape special characters in your XML feed. It lets you use your raw data without breaking the XML structure.

Try the following:

 <title><![CDATA[ YOUR RAW CONTENT]]></title> 

Note: do not use htmlentities and strip_tags here. They escape the text for a browser, whereas with CDATA any feed reader should read the raw characters correctly.

Quote from w3schools:

The term CDATA is used for text data that should not be parsed by an XML parser. Characters such as "<" and "&" are illegal in XML elements. "<" generates an error because the parser interprets it as the beginning of a new element. "&" generates an error because the parser interprets it as the beginning of a character entity. Some texts, such as JavaScript code, contain many "<" or "&" characters. To avoid script errors, the code can be defined as CDATA. The parser ignores the entire contents of the CDATA section. The CDATA section begins with "<![CDATA[" and ends with "]]>".

http://www.w3schools.com/xml/xml_cdata.asp
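A minimal sketch of the CDATA approach described above, in PHP. The helper name cdata_wrap and the sample title are hypothetical, not from the answer. The only sequence a CDATA section cannot contain is "]]>", so the sketch splits it across two adjacent sections.

```php
<?php
// Hypothetical helper: wrap raw text in a CDATA section for an RSS element.
function cdata_wrap(string $text): string
{
    // "]]>" would close the section early, so end the section after "]]"
    // and open a new one that starts with ">".
    $safe = str_replace(']]>', ']]]]><![CDATA[>', $text);
    return '<![CDATA[' . $safe . ']]>';
}

// Sample title (an assumption for illustration).
$title = 'Tom & Jerry <3 PHP';
echo '<title>' . cdata_wrap($title) . '</title>', "\n";
```

The raw "&" and "<" pass through untouched because an XML parser does not interpret markup inside the section.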


/* feedvalidator.org (Feedburner recommends this site to validate your feeds) says: "For the widest interop, the RSS Profile recommends the use of the hexadecimal character reference "&#x26;" to represent "&" and "&#x3C;" to represent "<"." */

 // find title problems
 // The \x escapes need double quotes; in single quotes PHP treats them as literal text.
 // These are Windows-1252 bytes, so this assumes the input is not UTF-8.
 $find[] = '<';
 $find[] = "\x92"; // right single quotation mark
 $find[] = "\x84"; // low double quotation mark

 // find content problems
 $find_c[] = "\x92";
 $find_c[] = "\x84";
 $find_c[] = '&nbsp;';

 // replace title
 $replace[] = '&#x3C;';
 $replace[] = '&#39;';
 $replace[] = '&#34;';

 // replace content
 $replace_c[] = '&#39;';
 $replace_c[] = '&#34;';
 $replace_c[] = ' ';

 // We don't want to re-replace "&" characters,
 // so do this first because of PHP "feature" https://bugs.php.net/bug.php?id=33773
 $title = str_replace('&', '&#x26;', $title);
 $title = str_replace($find, $replace, $title);
 $post_content = str_replace($find_c, $replace_c, $row[3]);

 // http://productforums.google.com/forum/#!topic/merchant-center/nIVyFrJsjpk
 $link = str_replace('&', '&amp;', $link);

Of course, I do some preprocessing before $title, $post_content and $link are added to my database. But this should help solve some common problems and produce a valid RSS feed.

Update: Fixed the &#x26;#x26;#x26; "recursion" problem, see https://bugs.php.net/bug.php?id=33773
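The "recursion" problem comes from str_replace() applying array pairs one after another, so a later pair can match text produced by an earlier one. As an alternative sketch (a swapped-in technique, not the code from this answer), strtr() with an array makes a single pass and never rescans its own output. The helper name rss_escape is hypothetical.

```php
<?php
// Hypothetical helper: escape "&" and "<" as hexadecimal character references.
function rss_escape(string $text): string
{
    // strtr() with an array scans the input once, so the "&" it emits
    // inside "&#x3C;" is not escaped a second time.
    return strtr($text, [
        '&' => '&#x26;',
        '<' => '&#x3C;',
    ]);
}

// For comparison: str_replace() handles the pairs sequentially, so
// replacing "<" before "&" escapes the freshly inserted "&" again.
$double = str_replace(['<', '&'], ['&#x3C;', '&#x26;'], 'a < b');

echo rss_escape('Q&A: a < b'), "\n";
echo $double, "\n";
```

With strtr() the order of the pairs no longer matters, which removes the need to replace "&" in a separate first step.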


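Take out htmlentities(). It is for HTML documents only: XML predefines just five named entities (&amp;, &lt;, &gt;, &quot; and &apos;), so HTML entities such as &nbsp; are not valid in a feed.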
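A small sketch of the difference, assuming PHP 7 or later and a hypothetical sample title. It compares htmlentities() with htmlspecialchars() using the ENT_XML1 flag, which is one possible replacement and not something this answer prescribes. It keeps the output well-formed XML; whether the validator's interoperability warning disappears is not guaranteed by this sketch.

```php
<?php
// Sample title (an assumption); "\u{e9}" is the UTF-8 encoding of é.
$title = "Caf\u{e9} & <b>Bar</b>";

// htmlentities() emits HTML-only named entities such as &eacute;,
// which an XML parser does not know.
$html = htmlentities(strip_tags($title), ENT_QUOTES, 'UTF-8');

// htmlspecialchars() with ENT_XML1 escapes only XML's special characters
// and leaves é as a plain UTF-8 character.
$xml = htmlspecialchars(strip_tags($title), ENT_XML1 | ENT_NOQUOTES, 'UTF-8');

echo $html, "\n";
echo $xml, "\n";
```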


Source: https://habr.com/ru/post/1444689/

