Regular expression to get class name from html

I know that my question may look like a duplicate of this question, but it isn't.
I am trying to match a class name inside HTML text, which comes from the server as a template, using a JavaScript RegExp, and replace it with another class name. This is what the code looks like:

    <div class='a b c d'></div>
    <!-- or -->
    <div class="a b c d"></div>
    <!-- There might be spaces after and before the = (the equal sign) -->

I want to match the class "b", for example, with the best possible performance.

Here is the regex that I used, but it doesn't work in all cases, and I don't know why:

    var key = 'b';
    statRegex = new RegExp('(<[\w+ class="[\\w\\s]*?)\\b('+key+')\\b([\\w\\s]*")');
    html.replace( statRegex,'SomeOtherClass'); // I may be mistaken in the way I am replacing it here
5 answers

Using a regex, this pattern should work for you:

    var r = new RegExp("(<\\w+?\\s+?class\\s*=\\s*['\"][^'\"]*?\\b)" + key + "\\b", "i");
    // The parenthesized group creates a backreference,
    // which will be accessible using $1 (see examples).
    // Using "i" makes the matching case-insensitive;
    // you can omit "i" for case-sensitive matching.

For example:

    var oldClass = "b";
    var newClass = "e";
    var r = new RegExp("..." + oldClass + "...");

    "<div class='a b c d'></div>".replace(r, "$1" + newClass);
    // ^-- returns: <div class='a e c d'></div>

    "<div class=\"a b c d\"></div>".replace(r, "$1" + newClass);
    // ^-- returns: <div class="a e c d"></div>

    "<div class='abcd'></div>".replace(r, "$1" + newClass);
    // ^-- returns: <div class='abcd'></div> // <-- NO change

Note:
For this regular expression to work, the class string must not contain ' or ".
That is, <div class="a 'b' cd"... does NOT match.
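If the class name comes from a variable, it is probably worth escaping it before building the pattern. Below is a minimal sketch of that idea, not part of the original answer: `escapeRegExp` and `replaceClass` are hypothetical helper names, and the "g" flag is added on the assumption that every matching tag in the template should be updated.

```javascript
// Hypothetical helpers, introduced only for illustration.

// Escape characters that have a special meaning inside a regular expression.
function escapeRegExp(text) {
    return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Replace oldClass with newClass in class attributes, using the same
// pattern shape as above (class must directly follow the tag name).
function replaceClass(html, oldClass, newClass) {
    var r = new RegExp(
        "(<\\w+\\s+class\\s*=\\s*['\"][^'\"]*?\\b)" + escapeRegExp(oldClass) + "\\b",
        "gi"
    );
    // A function replacer keeps any "$" in newClass from being
    // interpreted as a replacement pattern.
    return html.replace(r, function (match, prefix) {
        return prefix + newClass;
    });
}

console.log(replaceClass("<div class='a b c d'></div>", "b", "e"));
console.log(replaceClass('<div class = "a b c d"></div>', "b", "e"));
```

Two limitations carry over from the pattern itself: it only matches when class is the first attribute after the tag name, and \b treats a hyphen as a word boundary, so a class such as b-x would also be touched when replacing b.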


Take advantage of the browser:

    var str = '<div class=\'a b c d\'></div>\
    <!-- or -->\
    <div class="a b c d"></div>\
    <!-- There might be spaces after and before the = (the equal sign) -->';

    var wrapper = document.createElement('div');
    wrapper.innerHTML = str;

    var elements = wrapper.getElementsByClassName('b');

    if (elements.length) {
        // there are elements with class b
    }


By the way, getElementsByClassName() is not supported in IE prior to version 9; check this answer for an alternative.


Regular expressions are not suitable for parsing HTML. HTML is not regular.

jQuery can be very good here.

    var html = 'Your HTML here...';

    $('<div>' + html + '</div>').find('[class~="b"]').each(function () {
        console.log(this);
    });

The [class~="b"] selector selects any element whose class attribute contains the word b. The initial HTML is wrapped inside a div so that the find method works correctly.
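To make "contains the word b" concrete, here is a rough plain-JavaScript approximation of what the ~= attribute selector checks. It is only a sketch: `matchesWordSelector` is a hypothetical name, and it works on an attribute value string rather than on real elements.

```javascript
// Hypothetical helper approximating the [attr~="word"] test:
// the attribute value is treated as a whitespace-separated list of words,
// and one of them must be exactly equal to the requested word.
function matchesWordSelector(attributeValue, word) {
    // An empty word, or a word containing whitespace, never matches.
    if (word === "" || /[ \t\n\f\r]/.test(word)) {
        return false;
    }
    var words = attributeValue.split(/[ \t\n\f\r]+/);
    return words.indexOf(word) !== -1;
}

console.log(matchesWordSelector("a b c d", "b")); // true
console.log(matchesWordSelector("abcd", "b"));    // false
```

This is why the selector finds class="a b c d" but not class="abcd": the comparison is per word, not per substring.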


Test it here: https://regex101.com/r/vnOFjm/1

regexp: (?:class|className)=(?:["']\W+\s*(?:\w+)\()?["']([^'"]+)['"]

    const regex = /(?:class|className)=(?:["']\W+\s*(?:\w+)\()?["']([^'"]+)['"]/gmi;
    const str = `<div id="content" class="container">
    <div style="overflow:hidden;margin-top:30px">
    <div style="width:300px;height:250px;float:left">
    <ins class="adsbygoogle turbo" style="display:inline-block !important;width:300px;min-height:250px; display: none !important;" data-ad-client="ca-pub-1904398025977193" data-ad-slot="4723729075" data-color-link="2244BB" qgdsrhu="" hidden=""></ins>
    <img src="http://static.teleman.pl/images/pixel.gif?show,753804,20160812" alt="" width="0" height="0" hidden="" style="display: none !important;">
    </div>`;
    let m;

    while ((m = regex.exec(str)) !== null) {
        // This is necessary to avoid infinite loops with zero-width matches
        if (m.index === regex.lastIndex) {
            regex.lastIndex++;
        }

        // The result can be accessed through the `m`-variable.
        m.forEach((match, groupIndex) => {
            console.log(`Found match, group ${groupIndex}: ${match}`);
        });
    }

This may not be the solution for you, but if you do not need full regex matching, you can do it like this (assuming your examples are representative of the data you will be analyzing):

    function hasTheClass(html_string, classname) {
        // !!~ turns -1 into false, and anything else into true.
        return !!~html_string.split("=")[1].split(/[\'\"]/)[1].split(" ").indexOf(classname);
    }

    hasTheClass("<div class='a b c d'></div>", 'b'); // returns true
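Note that the split-based version only looks at the text after the first = sign, so if another attribute such as id comes before class, it inspects that attribute instead. A slightly more tolerant sketch is shown below; `hasClassToken` is a hypothetical name, and it is still a string-level heuristic rather than a real HTML parser.

```javascript
// Hypothetical helper: checks every class="..." or class='...' attribute
// in the string for an exact, whitespace-separated class name.
function hasClassToken(htmlString, className) {
    // \1 requires the closing quote to be the same as the opening quote.
    var attrPattern = /\bclass\s*=\s*(['"])(.*?)\1/gi;
    var match;
    while ((match = attrPattern.exec(htmlString)) !== null) {
        var tokens = match[2].split(/\s+/);
        if (tokens.indexOf(className) !== -1) {
            return true;
        }
    }
    return false;
}

console.log(hasClassToken("<div id='x' class = \"a b c d\"></div>", "b"));
console.log(hasClassToken("<div class='abcd'></div>", "b"));
```

Because \b also matches after a hyphen, attributes such as data-class are picked up as well, which may or may not be what you want.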

Source: https://habr.com/ru/post/1480866/

