Why does getting the HTML inside NOSCRIPT return HTML entities?

Given the code:

<noscript><div>FOO</div></noscript>

Running

$('noscript').html();

returns &lt;div&gt;FOO&lt;/div&gt;

whereas

$('noscript').text();

returns the raw HTML.

This is the opposite of what I expected. Is there any explanation for this?

3 answers

This is more of a DOM quirk than a jQuery one:

$("<noscript><div>FOO</div></noscript>")[0].innerHTML == "&lt;div&gt;FOO&lt;/div&gt;"

$("<noscript><div>FOO</div></noscript>")[0].textContent == "<div>FOO</div>"

Basically, the behavior here is inconsistent, as this answer explains.
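
One way to picture the discrepancy: with scripting enabled, the parser treats the contents of noscript as plain text, so the element holds a single text node rather than a div child. A serializer that escapes that text node yields the entity form, while textContent hands back the characters untouched. Below is a minimal sketch in plain JavaScript (no DOM needed), where escapeTextForSerialization is a hypothetical stand-in for the browser's escaping step, not a real DOM API:

```javascript
// Sketch of the text-node escaping step applied when a DOM is serialized to markup.
// escapeTextForSerialization is a hypothetical name used for illustration only.
function escapeTextForSerialization(text) {
  return text
    .replace(/&/g, "&amp;")       // runs first so the entities added below are not re-escaped
    .replace(/\u00A0/g, "&nbsp;") // non-breaking space
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// With scripting enabled, the noscript body is kept as one text node.
var noscriptText = "<div>FOO</div>";

console.log(noscriptText);                             // what textContent exposes
console.log(escapeTextForSerialization(noscriptText)); // &lt;div&gt;FOO&lt;/div&gt;
```

This only models the escaping; whether a given browser applies it to noscript content is exactly the inconsistency described above.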


There seems to be a PhantomJS bug that escapes the markup inside noscript tags into entities in page.content. This function restores that content to its proper form. The S object comes from the string package available at npmjs.org.

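var S = require('string'); // "string" package from npm; assumes a CommonJS-style require

function fixNoScript(content) {
  // Match noscript elements whose body contains no literal "<" (i.e. escaped markup)
  var noscript = /<\s*noscript\s*>([^<]+)<\s*\/\s*noscript\s*>/ig;
  var matches = content.match(noscript);
  // match() returns null when nothing is found, so guard before looping
  for ( var i = 0; matches && i < matches.length; i++ ) {
    var decoded = S(matches[i]).decodeHTMLEntities().s;
    var index = content.indexOf(matches[i]);
    // Splice the decoded element back in place of the escaped one
    content = content.substring(0, index) +
              decoded +
              content.substring(index + matches[i].length);
  }
  return content;
}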
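
If pulling in the string package is not an option, the same idea can be sketched without dependencies. This is a rough variant under stated assumptions, not a drop-in replacement: decodeBasicEntities and fixNoScriptStandalone are hypothetical names, and the decoder only understands the five named entities listed plus numeric references.

```javascript
// Hypothetical minimal decoder: handles only the names below plus numeric references.
var NAMED = { lt: "<", gt: ">", amp: "&", quot: '"', apos: "'" };

function decodeBasicEntities(text) {
  // Single pass, so "&amp;lt;" becomes "&lt;" rather than "<".
  return text.replace(/&(#x[0-9a-f]+|#[0-9]+|[a-z]+);/gi, function (whole, body) {
    if (body.charAt(0) === "#") {
      var isHex = body.charAt(1).toLowerCase() === "x";
      var code = parseInt(body.slice(isHex ? 2 : 1), isHex ? 16 : 10);
      // Leave out-of-range references untouched instead of throwing.
      return code <= 0x10FFFF ? String.fromCodePoint(code) : whole;
    }
    // Unknown names are left as they are.
    return Object.prototype.hasOwnProperty.call(NAMED, body) ? NAMED[body] : whole;
  });
}

function fixNoScriptStandalone(content) {
  var noscript = /(<\s*noscript\s*>)([^<]+)(<\s*\/\s*noscript\s*>)/ig;
  // Decode only the body; the opening and closing tags pass through unchanged.
  return content.replace(noscript, function (whole, open, inner, close) {
    return open + decodeBasicEntities(inner) + close;
  });
}

console.log(fixNoScriptStandalone("<p>a</p><noscript>&lt;div&gt;FOO&lt;/div&gt;</noscript>"));
// <p>a</p><noscript><div>FOO</div></noscript>
```

Using replace with a callback avoids the manual indexOf/substring splicing and copes with no matches without a null check.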

Yep, this is completely inconsistent, because HTML entities are destroyed: &copy; is always converted to the copyright symbol by BOTH functions. Basically, there should be a function that simply extracts the HTML as it is, without any conversions.
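
As a rough approximation of such a function, one could pick between the two properties depending on how the element was parsed. This is only a sketch: noscriptMarkup is a hypothetical helper name, it relies on the standard firstElementChild, innerHTML and textContent properties, and it does not address the entity conversion complained about above.

```javascript
// noscriptMarkup is a hypothetical helper; el is a noscript Element (or an object shaped like one).
function noscriptMarkup(el) {
  // Real child elements exist (e.g. parsed with scripting disabled): innerHTML is already markup.
  if (el.firstElementChild) {
    return el.innerHTML;
  }
  // Otherwise the body is a lone text node holding the markup as plain characters.
  return el.textContent;
}

// In a browser: noscriptMarkup(document.querySelector("noscript"));
```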


Source: https://habr.com/ru/post/1707740/
