Best way to sanitize content using PHP?

What is the best way to "sanitize" content? For example...

Example - before sanitizing:

Morbi mollis ante vitae massa suscipit a tempus est pellentesque. Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas. Nulla mattis iaculis consectetur.

Morbi mollis ante vitae est pellentesque. Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas. Nulla mattis iaculis consectetur.

Example - after sanitizing:

 <p>Morbi mollis ante vitae massa suscipit a tempus est pellentesque. Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas. Nulla mattis iaculis consectetur.</p> <p>Morbi mollis ante vitae est pellentesque. Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas. Nulla mattis iaculis consectetur.</p> 

What it should do:

  • It should add p tags instead of line breaks.
  • It should remove extra whitespace, such as triple spaces.
  • It should remove double line breaks.
  • It should remove tabs.
  • It should remove line breaks and spaces before the content, if any.
  • It should remove line breaks and spaces after the content, if any.

Am I right to use the str_replace function for this, or is there a better solution?

I want the function to look like this:

    function sanitize($content) {
        // Do the magic!
        return $content;
    }
+4
4 answers
  • It should add p tags instead of line breaks.

Run it through something like a Textile or Markdown interpreter, or any other lightweight markup language that suits your needs.

  • It should remove extra whitespace, such as triple spaces.
  • It should remove double line breaks.
  • It should remove tabs.
  • It should remove line breaks and spaces before the content, if any.
  • It should remove line breaks and spaces after the content, if any.

Why bother? When HTML is rendered as a document, runs of whitespace characters are collapsed into a single space, no? Most of your problems solve themselves.

+6
    function sanitize($content) {
        // leading white space (also swallows blank lines)
        $content = preg_replace('!^\s+!m', '', $content);
        // trailing white space
        $content = preg_replace('![ \t]+$!m', '', $content);
        // tabs and multiple white space
        $content = preg_replace('![ \t]+!', ' ', $content);
        // multiple newlines
        $content = preg_replace('![\r\n]+!', "\n", $content);
        // paragraphs: wrap every remaining line in <p> tags
        $content = preg_replace('!(.+)!m', '<p>$1</p>', $content);
        // done
        return $content;
    }
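As a side note on the str_replace idea from the question, here is a minimal sketch of why a single str_replace call does not collapse longer runs of spaces, while a regular expression does. The variable names are only illustrative.

```php
<?php
$text = "a    b"; // four spaces between the letters

// str_replace makes one left-to-right pass and does not rescan its own output,
// so each pair of spaces becomes one space and two spaces are left over.
$once = str_replace('  ', ' ', $text);

// A regular expression matches the whole run of spaces/tabs in one go.
$collapsed = preg_replace('/[ \t]+/', ' ', $text);
```

To get the same result with str_replace you would have to call it in a loop until the string stops changing, which is why the answer above relies on preg_replace.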

Example:

    $s = <<<END
    Morbi mollis ante vitae massa suscipit a tempus est pellentesque. Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas. Nulla mattis iaculis consectetur.

    Morbi mollis ante vitae est pellentesque. Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas. Nulla mattis iaculis consectetur.
    END;

    $out = sanitize($s);

Output:

 <p>Morbi mollis ante vitae massa suscipit a tempus est pellentesque. Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas. Nulla mattis iaculis consectetur.</p> <p>Morbi mollis ante vitae est pellentesque. Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas. Nulla mattis iaculis consectetur.</p> 
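If you would rather handle the paragraphs explicitly, the requirements listed in the question could also be sketched as below. This is only one possible variant, not the code from this answer: the function name text_to_paragraphs is made up for illustration, and the htmlspecialchars call is an added assumption (the question does not ask for escaping) for the case where the text comes from users.

```php
<?php
// Hypothetical helper (the name is illustrative): turn plain text into <p> blocks.
function text_to_paragraphs(string $content): string
{
    // Normalise Windows and old Mac line endings to "\n".
    $content = str_replace(["\r\n", "\r"], "\n", $content);

    // trim() drops whitespace before and after the content; every run of
    // line breaks (with any surrounding blanks) is then a paragraph boundary.
    $blocks = preg_split('/\s*\n\s*/', trim($content), -1, PREG_SPLIT_NO_EMPTY);

    $paragraphs = [];
    foreach ($blocks as $block) {
        // Collapse tabs and repeated spaces into a single space.
        $block = preg_replace('/[ \t]+/', ' ', $block);
        // Assumption: escape HTML special characters so the input cannot inject markup.
        $paragraphs[] = '<p>' . htmlspecialchars($block, ENT_QUOTES, 'UTF-8') . '</p>';
    }

    return implode("\n", $paragraphs);
}
```

Usage is the same as above: `$out = text_to_paragraphs($s);`. An empty or whitespace-only input gives an empty string instead of an empty paragraph.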
+6

Take a look at the CakePHP Sanitize class.

+3

Tidy!

There is a rather outdated article on Zend DevZone, but look at the example they give:

http://devzone.zend.com/article/761

+1

Source: https://habr.com/ru/post/1304695/

