Regex to search the text in returned HTML, but not the actual HTML (jQuery)

I'm making a highlighting plugin so that a client can find things on a page, and decided to test it with a help viewer I'm still building, but I've run into a problem that will (probably) require some regex.

I don't want to parse HTML, and I'm completely open to ideas on how to do this differently; this just seems like the best / right way.

http://oscargodson.com/labs/help-viewer

http://oscargodson.com/labs/help-viewer/js/jquery.jhighlight.js

Type something into the search ... OK, now refresh the page and type, for example, class or class=" or <a . You will notice that it searches the actual HTML (as expected). How can I search only the text?

If I use .text() , it strips out all the HTML, and what I get back is just one big block of text, but I still want the HTML so that I don't lose formatting, links, images, etc. I want this to work like CMD / CTRL + F.

You use the plugin like this:

$('article').jhighlight({find:'class'});

To remove the highlights:

.jhighlight('remove')

== UPDATE ==

Although Mike Samuel's idea below does work, it's a bit heavy for this plugin. The plugin is mainly for a client who wants to strip out bad words and / or MS Word characters during the "publishing" step of a form. I'm looking for a more lightweight fix. Any ideas?

+4
source share
5 answers

You really don't want to use eval, mess with innerHTML, or parse the markup "manually." The best way, in my opinion, is to deal with the text nodes directly and keep a cache of the original HTML so the highlights can be erased. A quick rewrite, with comments:

 (function($){
   $.fn.jhighlight = function(opt) {
     // restore the cached markup to clear the highlights
     if (opt === 'remove') {
       return this.each(function(){
         var self = $(this);
         if (self.data('htmlCache')) self.html(self.data('htmlCache'));
       });
     }

     // extend a copy so the shared defaults are never modified
     var options = $.extend({}, $.fn.jhighlight.defaults, opt)
       , txtProp = 'textContent' in document.documentElement ? 'textContent' : 'innerText';

     // nothing to search for
     if ($.trim(options.find).length < 1) return this;

     return this.each(function(){
       var self = $(this);

       // use a cache to clear the highlights
       if (!self.data('htmlCache')) self.data('htmlCache', self.html());

       // create Tree Walker
       // https://developer.mozilla.org/en/DOM/treeWalker
       var walker = document.createTreeWalker(
         this, // walk only on target element
         NodeFilter.SHOW_TEXT,
         null,
         false
       );

       // note: options.find is used as a regular expression, unescaped
       var node
         , matches
         , flags = 'g' + (!options.caseSensitive ? 'i' : '')
         , exp = new RegExp('('+options.find+')', flags) // capturing
         , expSplit = new RegExp(options.find, flags) // no capturing
         , highlights = [];

       // walk this way
       // and save matched nodes for later
       while (node = walker.nextNode()) {
         if (matches = node.nodeValue.match(exp)) {
           highlights.push([node, matches]);
         }
       }

       // must replace stuff after the walker is finished
       // otherwise replacing a node will halt the walker
       for (var nn = 0, hln = highlights.length; nn < hln; nn++) {
         node = highlights[nn][0];
         matches = highlights[nn][1];
         var parts = node.nodeValue.split(expSplit) // split on matches
           , frag = document.createDocumentFragment(); // temporary holder

         // add text + highlighted parts in between
         // like a .join() but with elements :)
         for (var i = 0, ln = parts.length; i < ln; i++) {
           // non-highlighted text
           if (parts[i].length)
             frag.appendChild(document.createTextNode(parts[i]));

           // highlighted text
           // skip last iteration
           if (i < ln - 1) {
             var h = document.createElement('span');
             h.className = options.className;
             h[txtProp] = matches[i];
             frag.appendChild(h);
           }
         }

         // replace the original text node
         node.parentNode.replaceChild(frag, node);
       }
     });
   };

   $.fn.jhighlight.defaults = {
     find: '',
     className: 'jhighlight',
     color: '#FFF77B',
     caseSensitive: false,
     wrappingTag: 'span'
   };
 })(jQuery);

If you do any other kind of manipulation on the page, you may want to replace the cache with a different clean-up mechanism, although that is not trivial.

You can see the code that works here: http://jsbin.com/anace5/2/

You also need to add display: block to your new HTML elements; without it, the layout breaks in several browsers.

+2
source

I had this problem in the JavaScript code prettifier. I wanted to search the text but preserve the tags.

What I did was start with the HTML and decompose it into two parts:

  • Text content
  • Pairs of (index into the text content where the tag occurs, tag content)

So, given

 Lorem <b>ipsum</b> 

the end result is

 text = 'Lorem ipsum'
 tags = [6, '<b>', 11, '</b>']

which lets me search the text and then, based on the start and end indexes of the match, rebuild the HTML, including only the tags (and only balanced tags) that fall within that range.
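The decomposition described above could be sketched roughly as follows. This is not the answerer's actual code: `decompose` and `highlightFirst` are hypothetical names, the tag pattern is a naive assumption (it breaks if an attribute value contains `>`), and entities such as `&lt;` are not decoded. Each tag is recorded at the offset in the text where it would be re-inserted, so a closing tag sits just past the last character it encloses.

```javascript
// Hypothetical helper: split an HTML string into its text and a flat
// list of [offset, tag, offset, tag, ...] entries.
function decompose(html) {
  var tagPattern = /<[^>]+>/g; // naive assumption: no '>' inside attributes
  var text = '';
  var tags = [];
  var last = 0;
  var m;
  while ((m = tagPattern.exec(html)) !== null) {
    text += html.slice(last, m.index); // text before this tag
    tags.push(text.length, m[0]);      // where the tag sits in the text
    last = m.index + m[0].length;
  }
  text += html.slice(last);
  return { text: text, tags: tags };
}

// Hypothetical helper: highlight the first occurrence of `term` in the
// text, re-inserting every tag at its recorded offset. The highlight span
// is closed and reopened around each tag so the markup stays balanced.
function highlightFirst(html, term) {
  var d = decompose(html);
  var start = term ? d.text.indexOf(term) : -1;
  if (start === -1) return html;
  var end = start + term.length;
  var out = '';
  var pos = 0;

  // Emit text[pos..to), wrapping the part that overlaps [start, end).
  function emitText(to) {
    var a = Math.max(pos, start);
    var b = Math.min(to, end);
    if (a < b) {
      out += d.text.slice(pos, a) +
             '<span class="jhighlight">' + d.text.slice(a, b) + '</span>' +
             d.text.slice(b, to);
    } else {
      out += d.text.slice(pos, to);
    }
    pos = to;
  }

  for (var i = 0; i < d.tags.length; i += 2) {
    emitText(d.tags[i]);   // text up to the tag
    out += d.tags[i + 1];  // the tag itself, untouched
  }
  emitText(d.text.length); // trailing text
  return out;
}
```

Because the search runs on the text alone, a term like class never matches inside an attribute, and a word split across tags (for example ip inside a bold element followed by sum) can still be found.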

0
source

Take a look here: the equivalent of getElementsByTagName() for text nodes. You can probably adapt one of the proposed solutions to your needs (i.e., iterate over all text nodes, replacing words as you go). This will not work in cases like <tag>wo</tag>rd , but it's better than nothing, I think.
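As a rough illustration of iterating over text nodes, a recursive collector might look like the sketch below. `collectTextNodes` is a hypothetical name, not taken from the linked question; it relies only on the standard `nodeType` and `childNodes` properties.

```javascript
// Hypothetical helper: gather every text node under `root`, in document order.
function collectTextNodes(root) {
  var found = [];
  (function walk(node) {
    if (node.nodeType === 3) { // 3 is Node.TEXT_NODE
      found.push(node);
      return;
    }
    var children = node.childNodes || [];
    for (var i = 0; i < children.length; i++) {
      walk(children[i]);
    }
  })(root);
  return found;
}
```

Each collected node's nodeValue can then be searched on its own, which is why a word split across two elements is missed by this approach.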

0
source

I think you could just do:

 $('#article :not(:has(*))').jhighlight({find : 'class'}); 

Since this grabs all leaf nodes in the article, it requires valid XHTML; that is, it will only match link in the following example:

 <p>This is some paragraph content with a <a href="#">link</a></p> 

The DOM traversal / selector evaluation can slow things down a bit, so caching the selection can be useful:

 var article_nodes = article_nodes || $('#article :not(:has(*))');
 article_nodes.jhighlight({find : 'class'});
0
source

Maybe something like this would be useful:

 >+[^<]*?(s(<[\s\S]*?>)?e(<[\s\S]*?>)?e)[^>]*?<+ 

The first part, >+[^<]*? , matches the > that closes the preceding tag.

The third part, [^>]*?<+ , matches the < that opens the following tag.

In the middle we have (<[\s\S]*?>)? between the characters of our search phrase (in this case, “see”), so the phrase still matches when tags fall between its letters.

After running the regex, you can use the result of the middle part to highlight the search phrase for the user.
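To avoid writing such a pattern by hand for every phrase, it could be generated. The sketch below makes two changes that are assumptions rather than part of the answer: the optional tag is written as a non-capturing (?:<[^>]*>)? so that it cannot span more than one tag and so that group 1 is always the matched phrase, and the phrase's characters are escaped. `escapeRegExp` and `buildTagTolerantPattern` are hypothetical names.

```javascript
// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: build a pattern that finds `phrase` between two
// tags, allowing one optional tag between consecutive characters.
function buildTagTolerantPattern(phrase) {
  var chars = phrase.split('').map(escapeRegExp);
  var middle = chars.join('(?:<[^>]*>)?');
  return new RegExp('>+[^<]*?(' + middle + ')[^>]*?<+');
}
```

For example, buildTagTolerantPattern('see').exec(html) returns a match whose element at index 1 is the phrase together with any tags inside it, or null when the phrase does not occur in text that sits between two tags.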

0
source

Source: https://habr.com/ru/post/1344005/

