How to remove html special characters?

I am creating an RSS feed file for my application, and I remove HTML tags from the content using strip_tags. But strip_tags does not remove HTML special character codes such as:

   &nbsp; &amp; &copy;

and so on.

Is there a function I can use to remove these special character codes from my string?

+46
php html-encode
Mar 18 '09 at 10:09
13 answers

Either decode them using html_entity_decode, or remove them using preg_replace:

 $Content = preg_replace("/&#?[a-z0-9]+;/i","",$Content); 

(From here)

EDIT: Alternative according to Jacco's comment

It may be better to replace the '+' with {2,8} or something similar. This limits the chance of removing entire sentences when an unencoded '&' is present.

 $Content = preg_replace("/&#?[a-z0-9]{2,8};/i","",$Content); 
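To see what the bounded quantifier changes, here is a small comparison of the two patterns. The sample string is made up for illustration; it mixes one real entity with an unencoded ampersand that happens to be followed by a semicolon.

```php
<?php
// Hypothetical input: "&D;" is not an entity, just an unencoded "&" near a ";".
$text = 'R&D; costs &amp; profits';

// Unbounded: any run of letters/digits between "&" and ";" is removed.
$greedy = preg_replace('/&#?[a-z0-9]+;/i', '', $text);

// Bounded: only runs of 2 to 8 characters are removed.
$bounded = preg_replace('/&#?[a-z0-9]{2,8};/i', '', $text);

echo $greedy, "\n";  // R costs  profits
echo $bounded, "\n"; // R&D; costs  profits
```

The bound is only a heuristic: a short stray sequence such as "&ab;" would still be removed.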
+94
Mar 18 '09 at 10:16

Use html_entity_decode to convert HTML entities.

You need to set the encoding for it to work correctly.
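A minimal sketch of what setting the encoding looks like, assuming the application works in UTF-8. The sample string is illustrative only.

```php
<?php
// Sample input is illustrative only.
$encoded = 'Caf&eacute; &amp; bar &copy; 2009';

// Pass the flags and the target charset explicitly rather than relying on defaults.
$decoded = html_entity_decode($encoded, ENT_QUOTES | ENT_HTML5, 'UTF-8');

echo $decoded, "\n"; // Café & bar © 2009
```

Note that &nbsp; decodes to a non-breaking space (U+00A0), not an ordinary space, so it may still need separate handling.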

+18
Mar 18 '09 at 10:15

In addition to the good answers above, PHP also has a built-in filter function that is quite useful: filter_var.

To remove HTML characters, use the following (note that FILTER_SANITIZE_STRING is deprecated as of PHP 8.1):

$cleanString = filter_var($dirtyString, FILTER_SANITIZE_STRING);

+16
Feb 16 '12 at 16:59

Take a look at htmlentities() and html_entity_decode():

 $orig = "I'll \"walk\" the <b>dog</b> now";
 $a = htmlentities($orig);
 $b = html_entity_decode($a);
 echo $a; // I'll &quot;walk&quot; the &lt;b&gt;dog&lt;/b&gt; now
 echo $b; // I'll "walk" the <b>dog</b> now
+7
Mar 18 '09 at 10:16

This may work to remove special characters.

 $modifiedString = preg_replace("/[^a-zA-Z0-9_.\s-]/", "", $content); // hyphen moved to the end so it is a literal, not an invalid range
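It is worth knowing that a whitelist like this removes the '&' and ';' but leaves the entity name behind, and it drops every non-ASCII character as well. A small sketch with a made-up input:

```php
<?php
// Illustrative input containing an entity and a non-ASCII character (the copyright sign).
$content = "Fish &amp; chips \u{A9} 2009";

// Hyphen placed last in the class so it is read as a literal, not as a range.
$modified = preg_replace('/[^a-zA-Z0-9_.\s-]/', '', $content);

echo $modified, "\n"; // Fish amp chips  2009
```

So this approach suits input that is known to be plain ASCII; for entity handling, decoding first is likely the better fit.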
+2
Mar 29 '13 at 9:58

What I did was use html_entity_decode, then strip_tags to remove the tags.
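A minimal sketch of that order of operations, assuming UTF-8 content. The helper name feedText is hypothetical and not part of PHP.

```php
<?php
// Hypothetical helper; decode first, then strip, as described above.
function feedText(string $html): string
{
    // Turn entities such as &amp; or &copy; into real characters.
    $decoded = html_entity_decode($html, ENT_QUOTES | ENT_HTML5, 'UTF-8');

    // Remove tags, including any that only appeared after decoding.
    return strip_tags($decoded);
}

echo feedText('<p>Tom &amp; Jerry &copy; 1940</p>'), "\n"; // Tom & Jerry © 1940
```

Because decoding runs first, markup that was entity-encoded (for example &lt;b&gt;) becomes a real tag and is stripped too. If the result goes into an RSS element, the remaining '&' characters still need XML escaping, for example with htmlspecialchars($text, ENT_XML1, 'UTF-8').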

+2
Dec 16 '13 at 15:36

Try this:

 <?php
 $str = "\x8F!!!";

 // Outputs an empty string
 echo htmlentities($str, ENT_QUOTES, "UTF-8");

 // Outputs "!!!"
 echo htmlentities($str, ENT_QUOTES | ENT_IGNORE, "UTF-8");
 ?>
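The PHP manual discourages ENT_IGNORE because silently discarding bytes may have security implications. A sketch of the same call with ENT_SUBSTITUTE instead, which keeps a visible marker where the invalid byte was:

```php
<?php
$str = "\x8F!!!"; // \x8F on its own is not valid UTF-8

// ENT_SUBSTITUTE replaces the invalid byte with U+FFFD instead of dropping it.
$out = htmlentities($str, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');

// Print the bytes as hex so the replacement character is visible.
echo bin2hex($out), "\n"; // efbfbd212121
```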
+2
Mar 11 '14 at 4:11

A simple plain-string way to do this without involving the preg regex engine:

 function remEntities($str) {
     if (substr_count($str, '&') && substr_count($str, ';')) {
         // Find the ampersand
         $amp_pos = strpos($str, '&');
         // Find the semicolon
         $semi_pos = strpos($str, ';');
         // Only if the ; is after the &
         if ($semi_pos > $amp_pos) {
             // Is an HTML entity, try to remove it
             $tmp = substr($str, 0, $amp_pos);
             $tmp = $tmp . substr($str, $semi_pos + 1, strlen($str));
             $str = $tmp;
             // Has another entity in it?
             if (substr_count($str, '&') && substr_count($str, ';'))
                 $str = remEntities($tmp);
         }
     }
     return $str;
 }
+1
Mar 18 '09 at 11:19

It looks like you really want:

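 function xmlEntities($string) {
     $from = array();
     $to = array();
     $translationTable = get_html_translation_table(HTML_ENTITIES, ENT_QUOTES, 'UTF-8');
     foreach ($translationTable as $char => $entity) {
         $from[] = $entity;
         // mb_ord() returns the full code point; ord() would only read the first byte
         $to[] = '&#' . mb_ord($char, 'UTF-8') . ';';
     }
     return str_replace($from, $to, $string);
 }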

It replaces named entities with their numeric equivalents.

+1
Mar 18 '09 at 16:21
 <?php
 function strip_only($str, $tags, $stripContent = false) {
     $content = '';
     if (!is_array($tags)) {
         $tags = (strpos($str, '>') !== false ? explode('>', str_replace('<', '', $tags)) : array($tags));
         if (end($tags) == '') array_pop($tags);
     }
     foreach ($tags as $tag) {
         if ($stripContent)
             $content = '(.+</'.$tag.'[^>]*>|)';
         $str = preg_replace('#</?'.$tag.'[^>]*>'.$content.'#is', '', $str);
     }
     return $str;
 }

 $str = '<font color="red">red</font> text';
 $tags = 'font';
 $a = strip_only($str, $tags); // red text
 $b = strip_only($str, $tags, true); // text
 ?>
+1
Jul 10 2018-10-10T00:

The code I used for this task builds on the update made by schnaader:

 mysql_real_escape_string(
     preg_replace_callback(
         "/&#?[a-z0-9]+;/i",
         function ($m) {
             // $m[0] is the whole matched entity (the pattern has no capture groups)
             return mb_convert_encoding($m[0], "UTF-8", "HTML-ENTITIES");
         },
         strip_tags($row['cuerpo'])
     )
 )

This strips all HTML tags and converts the HTML entities to UTF-8, ready to be saved in MySQL.

+1
Jul 14 '11 at 15:08

You can try htmlspecialchars_decode($string). This works for me.

http://www.w3schools.com/php/func_string_htmlspecialchars_decode.asp
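One thing to keep in mind is that htmlspecialchars_decode only handles the small set of entities produced by htmlspecialchars, so codes like &copy; from the question are left alone. A short comparison, using a made-up sample string:

```php
<?php
$encoded = '&lt;b&gt;Tom &amp; Jerry&lt;/b&gt; &copy; 1940';

// Only the htmlspecialchars() set is decoded; &copy; is left as it is.
$special = htmlspecialchars_decode($encoded);

// html_entity_decode() also handles named entities such as &copy;.
$all = html_entity_decode($encoded, ENT_QUOTES | ENT_HTML5, 'UTF-8');

echo $special, "\n"; // <b>Tom & Jerry</b> &copy; 1940
echo $all, "\n";     // <b>Tom & Jerry</b> © 1940
```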

0
01 Oct '15 at 12:56
 $string = "äáčé";
 $convert = Array(
     'ä'=>'a', 'Ä'=>'A', 'á'=>'a', 'Á'=>'A',
     'à'=>'a', 'À'=>'A', 'ã'=>'a', 'Ã'=>'A',
     'â'=>'a', 'Â'=>'A', 'č'=>'c', 'Č'=>'C',
     'ć'=>'c', 'Ć'=>'C', 'ď'=>'d', 'Ď'=>'D',
     'ě'=>'e', 'Ě'=>'E', 'é'=>'e', 'É'=>'E',
     'ë'=>'e',
 );
 $string = strtr($string, $convert);
 echo $string; // aace
-1
May 13 '15 at 11:32