preg_replace only OUTSIDE tags? (... we're not talking about full "HTML parsing", just a bit of markdown)

What is the easiest way to apply highlighting to some text, excluding the text inside the OCCASIONAL "<...>" tags?

CLARIFICATION: I want the existing tags to be PRESERVED!

$t = 
preg_replace(
  "/(markdown)/",
  "<strong>$1</strong>",
"This is essentially plain text apart from a few html tags generated with some
simplified markdown rules: <a href=markdown.html>[see here]</a>");

Which should display as:

"This is essentially plain text apart from a few html tags generated with some simplified markdown rules: [see here]" (with the word "markdown" in bold and "[see here]" as a link)

... BUT WITHOUT MESSING UP the text inside the anchor tag (i.e. <a href=markdown.html>).

I have heard the arguments against parsing HTML with regular expressions, but here we are talking about basically plain text, apart from minimal parsing of some markup code.

+3
5

This seems to work, in two passes:

<?php
$item="markdown";
$t="This is essentially plain text apart from a few html tags generated 
with some simplified markdown rules: <a href=markdown.html>[see here]</a>";

//_____1. apply emphasis_____
$t = preg_replace("|($item)|","<strong>$1</strong>",$t);

// "This is essentially plain text apart from a few html tags generated 
// with some simplified <strong>markdown</strong> rules: <a href=
// <strong>markdown</strong>.html>[see here]</a>"

//_____2. remove emphasis if WITHIN opening and closing tag____
$t = preg_replace("|(<[^>]+?)(<strong>($item)</strong>)([^<]+?>)|","$1$3$4",$t);

// this keeps the text before ($1) and after ($4) the emphasis, and replaces
// the wrapped match <strong>..</strong> ($2) with its bare content ($3)

// "This is essentially plain text apart from a few html tags generated
// with some simplified <strong>markdown</strong> rules: <a href=markdown.html>
// [see here]</a>"

?>

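Of course, a value such as $item="odd|string" would have to be escaped first, for example with preg_quote (and the text itself may need htmlentities(...) ...)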
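As a rough sketch of that escaping step, the two passes can be wrapped in a function that runs the search term through preg_quote first. The function name emphasize_outside_tags and the sample text are made up for illustration, and HTML-escaping of the input is deliberately left out.

```php
<?php
// Hypothetical helper wrapping the two passes from the answer above.
function emphasize_outside_tags(string $t, string $item): string
{
    // Escape regex metacharacters and the "|" delimiter used below.
    $q = preg_quote($item, "|");

    // Pass 1: emphasize every occurrence, even those inside tags.
    $t = preg_replace("|($q)|", '<strong>$1</strong>', $t);

    // Pass 2: strip the emphasis again where it landed inside a tag.
    return preg_replace(
        "|(<[^>]+?)(<strong>($q)</strong>)([^<]+?>)|",
        '$1$3$4',
        $t
    );
}

echo emphasize_outside_tags(
    "An odd|string here: <a href=odd|string.html>[see here]</a>",
    "odd|string"
);
```

Like the original, pass 2 only undoes one emphasized occurrence per tag for each call, so a tag containing the term several times would need the second replacement repeated.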

+2

You could split the string into tag and non-tag parts using preg_split:

$parts = preg_split('/(<(?:[^"\'>]|"[^"<]*"|\'[^\'<]*\')*>)/', $str, -1, PREG_SPLIT_DELIM_CAPTURE);

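Then iterate over the parts, skipping every odd-indexed part (i.e. the tags), and apply the replacement to the rest: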

for ($i=0, $n=count($parts); $i<$n; $i+=2) {
    $parts[$i] = preg_replace("/(markdown)/", "<strong>$1</strong>", $parts[$i]);
}

Finally, put everything back together with implode:

$str = implode('', $parts);

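Note, however, that this is not really the best solution. If you are dealing with real HTML, you are better off using a proper HTML parser such as PHP's DOM library.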
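For comparison, here is a minimal sketch of what the DOM route might look like. The helper name highlight_text_nodes and the wrapper id "hl-root" are invented for this example, and it assumes ASCII (or Latin-1) input, since loadHTML does not assume UTF-8 by default. The serializer may also normalize the markup on output, for instance by quoting attribute values.

```php
<?php
// Hypothetical helper: wrap $term in <strong>, touching text nodes only.
function highlight_text_nodes(string $html, string $term): string
{
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // silence parser warnings
    // Wrap the fragment in a known container so it can be found again.
    $doc->loadHTML('<div id="hl-root">' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $root = $xpath->query('//div[@id="hl-root"]')->item(0);
    $pattern = '/(' . preg_quote($term, '/') . ')/';

    // Copy the node list first, because the loop modifies the tree.
    $textNodes = iterator_to_array($xpath->query('.//text()', $root));
    foreach ($textNodes as $node) {
        $pieces = preg_split($pattern, $node->nodeValue, -1, PREG_SPLIT_DELIM_CAPTURE);
        if (count($pieces) < 2) {
            continue; // the term does not occur in this text node
        }
        $fragment = $doc->createDocumentFragment();
        foreach ($pieces as $i => $piece) {
            if ($i % 2 === 1) {
                // Odd indices hold the captured matches.
                $strong = $doc->createElement('strong');
                $strong->appendChild($doc->createTextNode($piece));
                $fragment->appendChild($strong);
            } elseif ($piece !== '') {
                $fragment->appendChild($doc->createTextNode($piece));
            }
        }
        $node->parentNode->replaceChild($fragment, $node);
    }

    // Serialize only the children of the wrapper, not the wrapper itself.
    $out = '';
    foreach ($root->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

echo highlight_text_nodes(
    'some simplified markdown rules: <a href=markdown.html>[see here]</a>',
    'markdown'
);
```

Because only text nodes are rewritten, attribute values such as the href are never touched, whatever they contain.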

+1

This regular expression should strip all opening and closing HTML tags: /(<.*?>)+/

You can use it with preg_replace as follows:

$test = "Hello <strong>World!</strong>";
$regex = "/(<.*?>)+/";


$result = preg_replace($regex,"",$test);
0

You can split your string into an array at every '<' or '>' using preg_split(), then loop through that array and replace only in the entries that do not start with '<'. Then you join the array back into a string with implode().
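One possible reading of this idea, as a small sketch: the split here is zero-width (a lookahead), so every piece keeps its leading bracket, and pieces beginning with '<' are treated as tag interiors and left alone. The function name highlight_between_brackets is hypothetical, and the sketch assumes that '<' and '>' only ever appear as tag delimiters (no '>' inside quoted attribute values, no bare '<' in the text).

```php
<?php
// Hypothetical helper for the split / replace / implode idea.
function highlight_between_brackets(string $text, string $term): string
{
    // Zero-width split before every "<" or ">": each piece keeps its
    // leading bracket, e.g. "<a href=x" (tag interior) or ">link" (text).
    $pieces = preg_split('/(?=[<>])/', $text, -1, PREG_SPLIT_NO_EMPTY);
    $pattern = '/(' . preg_quote($term, '/') . ')/';

    foreach ($pieces as $i => $piece) {
        // Only pieces that do not begin with "<" are plain text.
        if ($piece[0] !== '<') {
            $pieces[$i] = preg_replace($pattern, '<strong>$1</strong>', $piece);
        }
    }
    return implode('', $pieces);
}

echo highlight_between_brackets(
    'some simplified markdown rules: <a href=markdown.html>[see here]</a>',
    'markdown'
);
```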

0

It's actually not very efficient, but it worked for me:

$your_string = '...';

$search = 'markdown';
$left = '<strong>';
$right = '</strong>';

$left_Q = preg_quote($left, '#');
$right_Q = preg_quote($right, '#');
$search_Q = preg_quote($search, '#');
while(preg_match('#(>|^)[^<]*(?<!'.$left_Q.')'.$search_Q.'(?!'.$right_Q.')[^>]*(<|$)#isU', $your_string))
  $your_string = preg_replace('#(^[^<]*|>[^<]*)(?<!'.$left_Q.')('.$search_Q.')(?!'.$right_Q.')([^>]*<|[^>]*$)#isU', '${1}'.$left.'${2}'.$right.'${3}', $your_string);

echo $your_string;
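If the repeated matching in the loop is a concern, a single pass with preg_replace_callback might be an alternative. This is a sketch of a different technique, not a rewrite of the code above: the pattern matches either a whole tag or the search term, and the callback returns tags unchanged. The name highlight_single_pass is made up, and unlike the loop above this version does not check whether a term is already wrapped in <strong>, so running it twice would nest the tags.

```php
<?php
// Hypothetical single-pass variant using an alternation and a callback.
function highlight_single_pass(string $text, string $term): string
{
    // First alternative: a whole tag. Second alternative (group 1): the term.
    $pattern = '#<[^>]*>|(' . preg_quote($term, '#') . ')#i';

    return preg_replace_callback($pattern, function (array $m): string {
        // When a tag matched, group 1 is absent: return the tag untouched.
        return isset($m[1]) ? '<strong>' . $m[1] . '</strong>' : $m[0];
    }, $text);
}

echo highlight_single_pass(
    'Markdown and <a href=markdown.html>markdown</a>',
    'markdown'
);
```

The "i" flag mirrors the case-insensitive matching of the loop version. Since a tag is consumed as one match, the term can never be matched inside it.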
0

Source: https://habr.com/ru/post/1783609/