JavaScript DOM: get text nodes without losing spacing information

Question

I am using JavaScript and want to walk the HTML tree, collecting all the text as it appears to the user. However, I am losing spacing information.

Let's say I have two documents:

 <html>XXX<p>YY   YY</p></html>

 <html>XXX<p>YY&nbsp;&nbsp;&nbsp;YY</p></html>

The first renders with one space between the Ys; the second renders with three. However, if I traverse the tree and, for each #text node, use:

 text = node.nodeValue;

then the text for both nodes will have 3 spaces, and I no longer know which one has the "real" nbsp spaces. I can use innerHTML on the p element, which will show the nbsp, but I don't think I can use innerHTML to get only the XXX text (without any of the child text).

I could just get the innerHTML of the whole document and parse it. However, I also need the computed style of each element I use:

 window.getComputedStyle(theElement).getPropertyValue("text-align");

So I have to visit each node anyway. In addition, innerHTML shows the source as written, while walking the nodes "fixes" HTML errors by adding end tags, etc. This is a good thing and something I would like to preserve.

+6

javascript dom

user984003 Mar 08 '12 at 14:32

1 answer

bfavaretto · Accepted Answer · 2012-03-08T14:55:54+0000

What if you check the charCode? I believe the regular space is 32 and &nbsp; is 160.
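
A rough sketch of that check, assuming the string comes from a #text node's nodeValue. The helper names countSpaces and visibleText are made up for illustration and are not part of any DOM API, and visibleText only approximates rendering under the default CSS white-space behaviour (it ignores white-space: pre and similar styles):

```javascript
const SPACE = 32;  // regular space, U+0020
const NBSP = 160;  // non-breaking space, U+00A0 (what &nbsp; decodes to)

// Hypothetical helper: count regular and non-breaking spaces in a string.
function countSpaces(text) {
  let regular = 0;
  let nbsp = 0;
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    if (code === SPACE) regular++;
    else if (code === NBSP) nbsp++;
  }
  return { regular, nbsp };
}

// Hypothetical helper: approximate what the user sees by collapsing runs of
// regular whitespace to a single space while leaving NBSPs untouched.
// Note: \s is avoided on purpose, because in JavaScript \s also matches U+00A0.
function visibleText(text) {
  return text.replace(/[ \t\r\n]+/g, " ");
}
```

In the browser, this could be called as countSpaces(node.nodeValue) for each node whose nodeType is 3 (a text node), which keeps the node-by-node traversal and the getComputedStyle calls on the parent elements.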
