PHP / regex: "linkify" blog headers

I am trying to write a simple PHP function that can take a string like

Topic: Some stuff, Maybe some more, it's my stuff?

and return

topic-some-stuff-maybe-some-more-its-my-stuff

That is:

  • lowercase
  • remove all non-letter characters other than spaces
  • replace all spaces (or groups of spaces) with hyphens

Is it possible to do this with a single regex?

4 answers
    function Slug($string)
    {
        // Encode accented letters as HTML entities, keep only each entity's base
        // letter(s), decode the rest, then hyphenate, trim and lowercase.
        return strtolower(trim(preg_replace(
            '~[^0-9a-z]+~i',
            '-',
            html_entity_decode(
                preg_replace(
                    '~&([a-z]{1,2})(?:acute|cedil|circ|grave|lig|orn|ring|slash|th|tilde|uml);~i',
                    '$1',
                    htmlentities($string, ENT_QUOTES, 'UTF-8')
                ),
                ENT_QUOTES,
                'UTF-8'
            )
        ), '-'));
    }

    $topic = 'Iñtërnâtiônàlizætiøn';
    echo Slug($topic); // internationalizaetion

    $topic = 'Topic: Some stuff, Maybe some more, it\'s my stuff?';
    echo Slug($topic); // topic-some-stuff-maybe-some-more-it-s-my-stuff

    $topic = 'here عربي Arabi';
    echo Slug($topic); // here-arabi

    $topic = 'here 日本語 Japanese';
    echo Slug($topic); // here-japanese
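The nested call above is dense, so here is a sketch that runs the same pipeline one step at a time and keeps the intermediate values. The name slugSteps and the array keys are hypothetical, chosen only for illustration; the calls and patterns are the ones Slug() uses. It also suggests why the Arabic and Japanese text disappears: those characters have no named entity for the first step to produce, so they reach the `[^0-9a-z]+` replacement unchanged and are turned into hyphens.

```php
<?php
// Hypothetical helper: the Slug() pipeline, unrolled so each stage can be inspected.
function slugSteps(string $string): array
{
    $steps = [];

    // 1. Encode special characters as named HTML entities where one exists (ñ becomes &ntilde;).
    $steps['entities'] = htmlentities($string, ENT_QUOTES, 'UTF-8');

    // 2. Reduce accent-style entities to their base letter(s) (&ntilde; -> n, &aelig; -> ae).
    $steps['unaccented'] = preg_replace(
        '~&([a-z]{1,2})(?:acute|cedil|circ|grave|lig|orn|ring|slash|th|tilde|uml);~i',
        '$1',
        $steps['entities']
    );

    // 3. Decode any entities that step 2 did not touch.
    $steps['decoded'] = html_entity_decode($steps['unaccented'], ENT_QUOTES, 'UTF-8');

    // 4. Collapse every run of non-alphanumeric characters into one hyphen.
    $steps['hyphenated'] = preg_replace('~[^0-9a-z]+~i', '-', $steps['decoded']);

    // 5. Drop leading/trailing hyphens and lowercase the result.
    $steps['slug'] = strtolower(trim($steps['hyphenated'], '-'));

    return $steps;
}

// Print each stage for one input.
foreach (slugSteps('Café au lait') as $name => $value) {
    echo $name, ': ', $value, PHP_EOL;
}
```

The final `slug` entry for this input is `cafe-au-lait`.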

Why are regular expressions treated as a universal panacea for all of life's problems (just because a lowly backtrace in a preg_match once discovered the cure for cancer)? Here is a solution that does not resort to regex:

    $str = "Topic: Some stuff, Maybe some more, it's my stuff?";
    $str = implode('-', str_word_count(strtolower($str), 2));
    echo $str; // topic-some-stuff-maybe-some-more-it's-my-stuff

Without going the whole UTF-8 route:

    $str = "Topic: Some stuff, Maybe some more, it's my Iñtërnâtiônàlizætiøn stuff?";
    $str = implode('-', str_word_count(strtolower(str_replace("'", "", $str)), 2, 'Þßàáâãäåæçèéêëìíîïðñòóôõöøùúûüýþÿ'));
    echo $str;

gives

topic-some-stuff-maybe-some-more-its-my-iñtërnâtiônàlizætiøn-stuff
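To see why the second snippet strips apostrophes before splitting, it may help to look at what str_word_count returns. The sketch below is illustrative only: wordSlug is a hypothetical name, it handles ASCII input only, and it uses format 1 rather than 2 on the assumption that the byte offsets format 2 stores in the array keys are not needed, since implode ignores keys.

```php
<?php
// Format 2 keys each word by its byte offset; an apostrophe inside a word counts as part of it.
$words = str_word_count("it's my stuff?", 2);
// $words is [0 => "it's", 5 => 'my', 8 => 'stuff']

// Hypothetical wrapper around the answer's approach (ASCII input only).
function wordSlug(string $str): string
{
    // Remove apostrophes first so "it's" becomes "its" instead of keeping the quote.
    $clean = strtolower(str_replace("'", '', $str));

    // Format 1 returns a plain list of words, which is all implode needs.
    return implode('-', str_word_count($clean, 1));
}

echo wordSlug("Topic: Some stuff, Maybe some more, it's my stuff?"), PHP_EOL;
// topic-some-stuff-maybe-some-more-its-my-stuff
```

Digits are not word characters for str_word_count, so wordSlug('PHP 7 tips') gives php-tips; a title that depends on numbers would need the regex-based approaches instead.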


You can do this with a single preg_replace call:

    // Note: the /e modifier was deprecated in PHP 5.5 and removed in PHP 7.0.
    preg_replace(array("/[A-Z]/e", "/\\p{P}/", "/\\s+/"), array('strtolower("$0")', '', '-'), $str);

Technically, you could do it all with a single regex, but this way is simpler.

Preemptive response: yes, this uses regular expressions unnecessarily (albeit very simple ones), it makes an unreasonably large number of calls to strtolower, and it does not account for non-English characters (it does not even specify an encoding); I am just meeting the OP's requirements.
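Since the /e modifier was removed in PHP 7.0, the preg_replace call in this answer no longer runs on current PHP. A minimal sketch of the same idea for newer versions follows, under a few assumptions of mine rather than the answerer's: simpleSlug is a hypothetical name, the input is valid UTF-8 (hence the added u flag on the punctuation pattern), and lowercasing once up front with strtolower is an acceptable substitute for the per-letter callback.

```php
<?php
// Hypothetical PHP 7+ variant of the answer: lowercase, strip punctuation, hyphenate whitespace.
function simpleSlug(string $str): string
{
    $slug = preg_replace(
        ['/\p{P}/u', '/\s+/'],   // punctuation (Unicode-aware), then runs of whitespace
        ['', '-'],               // delete punctuation, replace whitespace runs with one hyphen
        strtolower($str)         // a single strtolower call instead of one per capital letter
    );

    // preg_replace returns null on error (for example, invalid UTF-8 with the u flag).
    return $slug ?? '';
}

echo simpleSlug("Topic: Some stuff, Maybe some more, it's my stuff?"), PHP_EOL;
// topic-some-stuff-maybe-some-more-its-my-stuff
```

As in the original call, only punctuation is removed: symbols such as + pass through unchanged, and leading or trailing whitespace turns into a hyphen rather than being trimmed.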


Source: https://habr.com/ru/post/1389867/

