Removing nested tags with simpleHTML

I am trying to use simple_html_dom to remove all spans from an HTML fragment, and I am using the following:

$body = "<span class='outer' style='background:red'>x<span class='mid' style='background:purple'>y<span class='inner' style='background:orange'>z</span></span></span>";
$HTML = new simple_html_dom;
$HTML->load($body);   
$spans = $HTML->find('span');
foreach($spans as $span_tag) {
    echo "working on ". $span_tag->class . " ... ";
    echo "setting " . $span_tag->outertext . " equal to " . $span_tag->innertext . "<br/>\n";
    $span_tag->outertext = (string)$span_tag->innertext;
}
$text =  $HTML->save();
$HTML->clear();
unset($HTML);
echo "<br/>The Cleaned TEXT is: $text<br/>";

And here is the result in my browser:

http://www.pixeloution.com/RAC/clean.gif

So why did it end up removing only the outermost span?

Edit

In fact, if there is an easier way to do this, I'm game. The goal is to remove the tags but keep everything inside them, including other tags; otherwise I would just use $obj->plaintext.
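One possible "easier way", sketched here with PHP's built-in DOM extension instead of simple_html_dom: move each span's children out in front of it, then delete the empty span. The helper name unwrapSpans() and the wrapper id "unwrap-root" are made up for this illustration, and the sketch assumes the fragment is plain ASCII or already entity-encoded, since loadHTML() may otherwise misread the character encoding.

```php
<?php
// Unwrap every <span> in an HTML fragment using the built-in DOM extension.
// unwrapSpans() is a hypothetical helper, not part of simple_html_dom.
function unwrapSpans(string $fragment): string
{
    $doc = new DOMDocument();

    // Silence libxml warnings about fragment markup, then restore the old setting.
    $previous = libxml_use_internal_errors(true);
    // Wrap the fragment in a known container so it can be read back afterwards.
    $doc->loadHTML('<div id="unwrap-root">' . $fragment . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // The wrapper is the first <div> in document order.
    $root = $doc->getElementsByTagName('div')->item(0);

    // Copy the live node list into an array: the list shrinks as spans are removed.
    $spans = iterator_to_array($doc->getElementsByTagName('span'), false);
    foreach ($spans as $span) {
        // Move each child (text or element) in front of the span...
        while ($span->firstChild !== null) {
            $span->parentNode->insertBefore($span->firstChild, $span);
        }
        // ...then drop the now-empty span itself.
        $span->parentNode->removeChild($span);
    }

    // Serialize only the wrapper's children, i.e. the cleaned fragment.
    $out = '';
    foreach ($root->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}
```

Calling unwrapSpans($body) on the fragment from the question should give xyz, and other tags inside the spans (for example a <b> element) are left in place because only span elements are unwrapped.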

Edit #2

Well ... apparently I got it working, although, oddly enough, I would still like to understand the problem if anyone has come across this before. Knowing that it only removes the outermost span, I did this:

function cleanSpansRecursive(&$body) {

    $HTML = new simple_html_dom;
    $HTML->load($body); 
    $spans = $HTML->find('span');
    foreach($spans as $span_tag) {
        $span_tag->outertext = (string)$span_tag->innertext;
    }

    $body =  (string)$HTML;
    if($HTML->find('span')) {
        $HTML->clear();
        unset($HTML);
        cleanSpansRecursive($body);
    } else {
        $HTML->clear();
        unset($HTML);
    }  
}

And it works.
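A likely explanation, based on how the simple_html_dom releases I have read behave (treat it as an assumption for other versions): assigning to outertext stores a fixed replacement string on that node, and save() uses that string without descending into the node's children. find() returns spans in document order, so the outer span is replaced first with an innertext that still contains the inner spans, and the later assignments never show up. Under that assumption, visiting the spans innermost first should clean everything in a single pass. The helper name stripSpansInnerFirst() is made up for this sketch.

```php
<?php
// Assumes simple_html_dom.php has already been loaded, for example with
// require_once 'simple_html_dom.php'; (the path is an assumption).
// stripSpansInnerFirst() is a hypothetical helper name.
function stripSpansInnerFirst(string $body): string
{
    $html = new simple_html_dom;
    $html->load($body);

    // find() lists matches in document order (outer spans before inner ones).
    // Reversing the array visits every nested span before its ancestors.
    foreach (array_reverse($html->find('span')) as $span) {
        // Inner spans already carry their replacement text at this point,
        // so innertext is assembled from the cleaned children.
        $span->outertext = (string)$span->innertext;
    }

    $text = $html->save();
    $html->clear();
    unset($html);
    return $text;
}
```

With the fragment from the question this should return xyz without re-parsing, though the recursive version above remains a reasonable fallback if a given release of the library behaves differently.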


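I don't have simple_html_dom installed on my dev machine, so I can't test this, but it looks like setting $span_tag->outertext on the outer span overrides the output of the spans nested inside it when $HTML is saved. If so, the later assignments change nothing in the result, and only the outermost replacement shows up.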


Source: https://habr.com/ru/post/1730725/

