How to substr HTML entities?

Question

I have something like the following:

    $mytext = "that&#039;s really &quot;confusing&quot; and &lt;absolutly&gt; silly";
    echo substr($mytext, 0, 6);

The output in this case is that&# instead of that's.

What I want is to count each HTML entity as 1 character and then substr, because I always end up with broken HTML or some stray characters at the end of the text.

Please don't suggest that I HTML-decode it, substr it, and then encode it again; I want a clean method :)

thanks

+4

php html-entities

Emily Apr 17 '10 at 4:15

6 answers

cletus · Answer 1 · 2010-04-17T04:45:46+0000

There are two ways to do this:

1. You can decode the HTML entities, substr() and then encode; or
2. You can use a regex.

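(1) uses html_entity_decode() and htmlentities():

    // ENT_QUOTES is passed explicitly so that &#039; is also decoded on PHP versions before 8.1.
    $s = html_entity_decode($mytext, ENT_QUOTES, 'UTF-8');
    $sub = substr($s, 0, 6);
    echo htmlentities($sub, ENT_QUOTES, 'UTF-8');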

(2) might look something like this:

    if (preg_match('!^([^&]|&(?:.*?;)){0,5}!s', $mytext, $match)) {
        echo $match[0];
    }

What this does: match up to 5 occurrences of the preceding expression, starting at the beginning of the string. The preceding expression is either:

- any character that is not an ampersand; or
- an ampersand followed by anything up to and including a semicolon (that is, an HTML entity).

This is not ideal, so I would prefer (1).
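For reference, approach (2) could be wrapped in a function along these lines. This is only a sketch, not the code from the answer: the name entity_substr() is hypothetical, the entity pattern is tightened so that a bare ampersand counts as an ordinary character, and the /u modifier assumes the input is valid UTF-8.

```php
<?php
// Hypothetical helper (not from the thread): return the first $length "units" of $html,
// where a unit is either one well-formed entity or one single character.
function entity_substr(string $html, int $length): string
{
    if ($length <= 0) {
        return '';
    }
    // PCRE does not accept repeat counts above 65535.
    $length = min($length, 65535);
    // Named (&quot;), decimal (&#039;) or hexadecimal (&#x27;) entity, otherwise any one character.
    $unit = '&(?:[a-zA-Z][a-zA-Z0-9]*|#[0-9]+|#[xX][0-9a-fA-F]+);|.';
    $pattern = '/^(?:' . $unit . '){0,' . $length . '}/su';
    // preg_match() returns false on invalid UTF-8 input; fall back to an empty string.
    return preg_match($pattern, $html, $m) === 1 ? $m[0] : '';
}

$mytext = "that&#039;s really &quot;confusing&quot; and &lt;absolutly&gt; silly";
echo entity_substr($mytext, 6); // prints: that&#039;s
```

Because the entity alternative is tried before the single-character alternative, an entity is never split, and text such as "a & b" is still counted character by character.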

Syntax Error · Answer 2 · 2010-04-17T05:19:36+0000

    function encoded_substr($string, $param, $param2) {
        $s = html_entity_decode($string);
        $sub = substr($s, $param, $param2);
        return htmlentities($sub);
    }

There, I copied cletus' code into a function for you. Now you can call a very simple 3-line function with 1 line of code. If that is not “clean,” I am confused about what “clean” means.

Rasmus · Answer 3 · 2015-05-19T10:28:28+0000

Note that some characters break the proposed decode + encode approach if you use substr().

Example

    $string = html_entity_decode("Workin&#8217; on my Fitness&#8230;In the Backyard.");
    echo $string;
    echo substr($string, 0, 25);
    echo htmlentities(substr($string, 0, 25));

Conclusion:

Work out at my fitness ... In the backyard.
Work out on my fitness
(empty line)

Decision

Use mb_substr() .

    echo mb_substr($string, 0, 25);
    echo htmlentities(mb_substr($string, 0, 25));

Output:

Workin’ on my Fitness…In
Workin&rsquo; on my Fitness&hellip;In
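The difference comes down to bytes versus characters. The sketch below assumes UTF-8 and the mbstring extension; it writes the decoded text from the example above as a literal and checks where each function cuts.

```php
<?php
// The decoded example text, written directly as UTF-8 (U+2019 is ’, U+2026 is …).
$string = "Workin\u{2019} on my Fitness\u{2026}In the Backyard.";

$bytes = substr($string, 0, 25);              // substr() counts bytes
$chars = mb_substr($string, 0, 25, 'UTF-8');  // mb_substr() counts characters

var_dump(strlen("\u{2026}"));                 // int(3): the ellipsis takes three bytes in UTF-8
var_dump(mb_check_encoding($bytes, 'UTF-8')); // bool(false): the cut landed inside the ellipsis
var_dump(mb_check_encoding($chars, 'UTF-8')); // bool(true)
var_dump(mb_strlen($chars, 'UTF-8'));         // int(25)
```

The byte-based cut leaves an incomplete multibyte sequence at the end, which is why htmlentities() had nothing valid to encode in the first example.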

iCLIENT Technoloogies · Answer 4 · 2012-12-01T11:13:37+0000

Try the following function.

    <?php
    $mytext = "that&#039;s really &quot;confusing&quot; and &lt;absolutly&gt; silly";
    echo limit_text($mytext, 6);

    function limit_text($text, $limit) {
        // Collect every entity (non-greedy match from "&" to ";").
        preg_match_all("/&(.*)\;/U", $text, $pat_array);
        $additional = 0;
        foreach ($pat_array[0] as $key => $value) {
            if ($key < $limit) { $additional += (strlen($value) - 1); }
        }
        $limit += $additional;
        if (strlen($text) > $limit) {
            $text = substr($text, 0, $limit);
            // Trim back to the last space so the cut falls on a word boundary.
            $text = substr($text, 0, -(strlen(strrchr($text, ' '))));
        }
        return $text;
    }
    ?>

Your common sense · Answer 5 · 2010-04-17T05:31:41+0000

Well, there is only one clean method: do not use entities at all.
There is no good reason to substr an already-encoded string. The encoded form should only be used for output.
So, substr first, then encode.
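A minimal sketch of that order of operations, assuming the text is kept as raw (unencoded) UTF-8 until it is printed. The function name excerpt_html() is hypothetical, and mb_substr() is used here so that multibyte characters are not split.

```php
<?php
// Hypothetical helper: $raw is assumed to be unencoded UTF-8 text.
function excerpt_html(string $raw, int $length): string
{
    $cut = mb_substr($raw, 0, $length, 'UTF-8');         // truncate by characters first
    return htmlspecialchars($cut, ENT_QUOTES, 'UTF-8');  // encode only at output time
}

echo excerpt_html('that\'s really "confusing" and <absolutly> silly', 6); // prints: that&#039;s
```

With this order the truncation never sees an entity, so there is nothing to break.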

Stélio inácio · Answer 6 · 2018-03-07T20:35:42+0000

Here is a fix for Syntax Error's code: use mb_substr() to avoid surprises such as getting fewer characters than requested, or character counts that do not work as they should. In my case, Sábado became Sá:

    function encoded_substr($string, $param, $param2) {
        $s = html_entity_decode($string);
        $sub = mb_substr($s, $param, $param2);
        return htmlentities($sub);
    }
