JavaScript: Given the offset and length of the substring in the HTML string, what is the parent node?

Question

In my current project, I need to find each string from an array in the text content of an element, and then wrap the matching strings in <a> elements using JavaScript (the requirements are simplified here for clarity). I need to avoid jQuery if at all possible, or at least avoid including the full library.

For example, this HTML block:

 <div> <p>This is a paragraph of text used as an example in this Stack Overflow question.</p> </div>

and this array of strings to match:

 ['paragraph', 'example']

I would need to end up with the following:

 <div> <p>This is a <a href="http://www.example.com/">paragraph</a> of text used as an <a href="http://www.example.com/">example</a> in this Stack Overflow question.</p> </div>

I arrived at a solution for this using the innerHTML property and some string manipulation: mainly using offsets (via indexOf()) and the lengths of the strings in the array to split the HTML string at the appropriate character offsets and insert <a href="http://www.example.com/"> and </a> tags where needed.
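For illustration, a string-based version along those lines might look roughly like this. The function name `wrapTermsInString` and its parameters are made up for the example, and it assumes plain ASCII text, no overlapping matches, and that only the first occurrence of each term is wrapped. All offsets are computed on the original string before anything is inserted; otherwise a later search for "example" would hit the `www.example.com` inside a tag that was just added.

```javascript
// Hypothetical helper: wraps the first case-insensitive occurrence of each
// term in an <a> element by slicing the HTML string at the match offsets.
function wrapTermsInString(html, terms, href) {
  var lower = html.toLowerCase();
  var matches = [];

  // Find every offset on the untouched string first.
  for (var i = 0; i < terms.length; i++) {
    var offset = lower.indexOf(terms[i].toLowerCase());
    if (offset !== -1) {
      matches.push({ offset: offset, length: terms[i].length });
    }
  }

  // Insert from the end of the string backwards so that the offsets of
  // earlier matches are still valid after each insertion.
  matches.sort(function (a, b) { return b.offset - a.offset; });

  var result = html;
  for (var j = 0; j < matches.length; j++) {
    var start = matches[j].offset;
    var end = start + matches[j].length;
    result = result.slice(0, start) +
      '<a href="' + href + '">' + result.slice(start, end) + '</a>' +
      result.slice(end);
  }
  return result;
}
```

Because this only sees a flat string, it has no idea which element a match sits in, and a term can just as easily match inside a tag name or attribute value.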

However, an additional requirement has me stumped. I am not allowed to wrap a matched string in an <a> element if it is already inside one, or if it is a descendant of a heading element (<h1> to <h6>).

So, given the same array of strings as above and this HTML block (by the way, term matching must be case-insensitive):

 <div> <h1>Example</h1> <p>This is a <a href="http://www.example.com/">paragraph of text</a> used as an example in this Stack Overflow question.</p> </div>

I would have to ignore both the occurrence of Example in the <h1> element and the occurrence of paragraph in <a href="http://www.example.com/">paragraph of text</a>.

This tells me that I need to determine which node each match falls in, and then walk up its ancestors until I reach <body>, checking whether I encounter an <a> or <h1> to <h6> node along the way.
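As a rough sketch of that ancestor walk, assuming the node containing the match has already been located: the helper name `hasExcludedAncestor` is hypothetical, and it only relies on the standard `nodeType`, `nodeName` and `parentNode` properties. It keeps climbing until `parentNode` runs out rather than stopping at <body>, which gives the same answer for these tags.

```javascript
// Tag names that must not receive new links: A and H1 through H6.
// Case-insensitive so that lowercase names in XHTML documents also match.
var EXCLUDED_TAG = /^(A|H[1-6])$/i;

// Hypothetical helper: returns true if `node` itself or any of its
// ancestors is an <a> or heading element.
function hasExcludedAncestor(node) {
  for (var current = node; current; current = current.parentNode) {
    // nodeType 1 is ELEMENT_NODE; text nodes and the document are skipped.
    if (current.nodeType === 1 && EXCLUDED_TAG.test(current.nodeName)) {
      return true;
    }
  }
  return false;
}
```

A match would then be wrapped only when `hasExcludedAncestor(textNode)` returns false for the text node it was found in.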

Firstly, does that sound reasonable? Is there a simpler or more obvious approach that I have not considered? It doesn't seem like regular expressions or other string-based comparisons to find bounding tags will be reliable - I think of issues like self-closing elements, irregularly nested tags, etc. Also, this is ...

Secondly, is this possible, and if so, how would I approach it?

javascript string regex

Bungle May 09 '10 at 4:02

2 answers

Take a look at the jQuery Highlight plugin. It does almost what you need, except that you need a link and only the first occurrence of each word. Its source code is extremely simple, so it should not take much work to adapt it (even if you are not using jQuery it can help you a lot: it does not use jQuery internally, only to select DOM elements).

Kobi May 09 '10 at 5:24

rob · Accepted Answer · 2010-05-09T05:23:28+0000

You will probably have to iterate over the DOM elements. Here's a simple recursive DOM iterator; you can fill in the rest:

 function iterateDom(node) {
   switch (node.nodeType) {
     case 1: // ELEMENT_NODE
       // Skip <h1> subtrees; recurse into every other element.
       if (node.tagName != "H1") {
         for (var i = 0; i < node.childNodes.length; i++)
           iterateDom(node.childNodes[i]);
       }
       break;
     case 3: // TEXT_NODE
       // node.nodeValue = node.nodeValue.replace(...);
       break;
   }
   return true;
 }
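One possible way to fill in the rest, as a sketch rather than a finished solution: the same recursion, but skipping <a> and every heading level, and collecting the remaining text nodes so the terms can be searched inside each one. `collectTextNodes` and `findMatches` are hypothetical names; the code assumes ASCII terms and reports only the first occurrence of each term per text node.

```javascript
// Elements whose subtrees are skipped: A and H1 through H6.
var SKIPPED_TAG = /^(A|H[1-6])$/i;

// Hypothetical helper: collects, in document order, the text nodes under
// `root` that are not inside an <a> or heading element.
function collectTextNodes(root, out) {
  out = out || [];
  if (root.nodeType === 3) {                       // TEXT_NODE
    out.push(root);
  } else if (root.nodeType === 1 && !SKIPPED_TAG.test(root.nodeName)) {
    for (var i = 0; i < root.childNodes.length; i++) {
      collectTextNodes(root.childNodes[i], out);   // recurse into children
    }
  }
  return out;
}

// Hypothetical helper: finds the first case-insensitive occurrence of each
// term in `text` and returns {term, offset, length} objects sorted by offset.
function findMatches(text, terms) {
  var lower = text.toLowerCase();
  var matches = [];
  for (var i = 0; i < terms.length; i++) {
    var offset = lower.indexOf(terms[i].toLowerCase());
    if (offset !== -1) {
      matches.push({ term: terms[i], offset: offset, length: terms[i].length });
    }
  }
  matches.sort(function (a, b) { return a.offset - b.offset; });
  return matches;
}
```

In a browser, each match could then be wrapped by cutting the text node with `splitText()` at the match boundaries and moving the middle piece into an element made with `document.createElement('a')`, working from the last match in a node back to the first so the earlier offsets stay valid.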
