DOM parser for highlighting keywords does not work

This question is related to one I asked before, but since that topic is closed and I still need help, I am starting a new question. I hope that is acceptable.

In my previous question I simplified the problem, which led to simple but not fully working solutions. I realized this recently when I ran my code.

The problem with the solutions to my previous question is that the replacement functions also modify the HTML tags and break them. I have read in many posts on this site that I need to use a DOM parser. I am not at all familiar with it; I tried the code suggested by the user "ircmaxell" in this post, but it does not work for me.
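As an illustration of that kind of breakage, here is a minimal sketch. The sample string and keyword are made up for the example, and str_ireplace() stands in for whatever string replacement function is used:

```php
<?php
// Hypothetical sample input, made up for illustration only.
$html = '<p class="note">A note about class names.</p>';
$keyword = 'class';

// Naive approach: replace the keyword everywhere in the raw string.
$naive = str_ireplace($keyword, '<span class="ht">' . $keyword . '</span>', $html);

// The attribute name inside the opening <p> tag is rewritten as well,
// so the markup is no longer valid HTML.
echo $naive, "\n";
```

A DOM parser avoids this because the replacement is applied only to text nodes, never to tag names or attributes.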

Here is an example of what I did:

echo '<style type="text/css"> .ht{ background-color: yellow; } </style>';

/* taken from user ircmaxell at /questions/1397120/highlight-keywords-in-a-paragraph
   I just modified line $highlight->setAttribute('class', 'highlight')
   to $highlight->setAttribute('class', 'ht') and commented the first 2 lines */
function highlight_paragraph($string, $keyword) {
    //$string = '<p>foo<b>bar</b></p>';
    //$keyword = 'foo';
    $dom = new DomDocument();
    $dom->loadHtml($string);
    $xpath = new DomXpath($dom);
    $elements = $xpath->query('//*[contains(.,"'.$keyword.'")]');
    foreach ($elements as $element) {
        foreach ($element->childNodes as $child) {
            if (!$child instanceof DomText) continue;
            $fragment = $dom->createDocumentFragment();
            $text = $child->textContent;
            $stubs = array();
            while (($pos = stripos($text, $keyword)) !== false) {
                $fragment->appendChild(new DomText(substr($text, 0, $pos)));
                $word = substr($text, $pos, strlen($keyword));
                $highlight = $dom->createElement('span');
                $highlight->appendChild(new DomText($word));
                $highlight->setAttribute('class', 'ht');
                $fragment->appendChild($highlight);
                $text = substr($text, $pos + strlen($keyword));
            }
            if (!empty($text)) $fragment->appendChild(new DomText($text));
            $element->replaceChild($fragment, $child);
        }
    }
    $string = $dom->saveXml($dom->getElementsByTagName('body')->item(0)->firstChild);
    return $string;
}

$string = '<p>This book has been written against a background of both reckless optimism and reckless despair.</p> <p>It holds that Progress and Doom are two sides of the same medal; that both are articles of superstition, not of faith. 
It was written out of the conviction that it should be possible to discover the hidden mechanics by which all traditional elements of our political and spiritual world were dissolved into a conglomeration where everything seems to have lost specific value, and has become unrecognizable for human comprehension, unusable for human purpose.</p> <p> Hannah Arendt, The Origins of Totalitarianism (New York: Harcourt Brace Jovanovich, Inc., 1973 ed.), p.vii, Preface to the First Edition.</p>';

$keywords = array('This', 'book', 'has', 'been', 'written', 'background',
    'reckless', 'optimism', 'despair.', 'holds', 'Progress', 'Doom ', 'two',
    'sides', 'medal;', 'articles', 'superstition,', 'faith.', 'lost',
    'Arendt,', 'Totalitarianism');

foreach ($keywords as $kw) {
    $string = highlight_paragraph($string, $kw);
}

echo $string;
The final echo $string prints:

 This book has been written against a background of both reckless optimism and reckless despair. 

Only the first two words, 'This' and 'book', are highlighted.

It should print the entire original string, with every keyword highlighted.

I searched a lot on Stack Overflow and Google and did not find easy-to-use code that achieves this, even though many people have asked the same thing.

I really need help here. Thanks in advance!

1 answer

You are lucky that I was very bored when I saw this question. ;)

The code you got from that answer does not seem to have been tested; I do not see how it could work correctly. In any case, I fixed all the problems, and here is a working version, tested on a locally installed Apache server with PHP 5.3:

function highlight_paragraph($string, $keyword) {
    $dom = new DOMDocument();
    $dom->loadHtml($string);

    // Search for all text blocks containing the keyword
    $xpath = new DOMXpath($dom);
    $textNodes = $xpath->query('//*[contains(.,"'.$keyword.'")]/text()');

    foreach ($textNodes as $textNode) {
        $fragment = $dom->createDocumentFragment();
        $text = $textNode->nodeValue;
        $stubs = array();
        while (($pos = stripos($text, $keyword)) !== false) {
            $fragment->appendChild(new DOMText(substr($text, 0, $pos)));
            $word = substr($text, $pos, strlen($keyword));
            $highlight = $dom->createElement('span');
            $highlight->appendChild(new DOMText($word));
            $highlight->setAttribute('class', 'ht');
            $fragment->appendChild($highlight);
            $text = substr($text, $pos + strlen($keyword));
        }
        if (!empty($text)) $fragment->appendChild(new DOMText($text));
        $textNode->parentNode->replaceChild($fragment, $textNode);
    }

    return $dom->saveHTML();
}
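Note that saveHTML() called without arguments serializes the whole document, including the doctype and the html and body elements that loadHTML() wraps around a fragment. The original version serialized only the first child of body, which is why just the first paragraph came back. If you want only the fragment returned, one possible approach is to serialize each child of body in turn. This is a sketch, not part of the tested code above; the helper name body_inner_html is made up for illustration:

```php
<?php
// Hypothetical helper (name invented for this example): return the markup
// of everything inside <body>, without the doctype/html/body wrappers.
function body_inner_html(DOMDocument $dom) {
    $body = $dom->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return '';
    }
    $html = '';
    foreach ($body->childNodes as $child) {
        // saveXML() with a node argument serializes just that node.
        $html .= $dom->saveXML($child);
    }
    return $html;
}

// Small usage example with a made-up fragment.
$dom = new DOMDocument();
$dom->loadHTML('<p>foo <b>bar</b></p><p>baz</p>');
echo body_inner_html($dom), "\n";
```

With this helper, the last line of highlight_paragraph() could return body_inner_html($dom) instead of $dom->saveHTML(). Keep in mind that saveXML() writes XML-style output, so empty elements come out self-closed; on PHP 5.3.6 and later, saveHTML($child) is an alternative.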

Source: https://habr.com/ru/post/1397111/

