Using XPath: Find the last text node of each paragraph under the root node

Question

I want to trim trailing spaces at the end of all XHTML paragraphs. I am using Ruby with the REXML library.

Let's say that in a valid XHTML file there is the following:

<p>hello <span>world</span> a </p>
<p>Hi there </p>
<p>The End </p>

I want to end with this:

<p>hello <span>world</span> a</p>
<p>Hi there</p>
<p>The End</p>

So I thought I could use XPath to select only the text nodes I need and then trim their text, which would give me the result shown above.

I started with the following XPath:

//root/p/child::text()

Of course, the problem is that this returns all text nodes that are children of any p tag, namely:

'hello '
' a '
'Hi there '
'The End '
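For reference, a minimal sketch of how this expression might be evaluated with REXML. The constant `SAMPLE_XHTML` and the helper `paragraph_text_children` are hypothetical names used only for illustration, and the sample paragraphs are assumed to sit directly under a `root` element, as the XPath implies.

```ruby
require 'rexml/document'

# Sample input; the <root> wrapper is assumed from the XPath in the question.
SAMPLE_XHTML = <<~XML
  <root>
  <p>hello <span>world</span> a </p>
  <p>Hi there </p>
  <p>The End </p>
  </root>
XML

# Hypothetical helper: returns the string value of every text node
# that is a direct child of a <p> under <root>.
def paragraph_text_children(xml)
  doc = REXML::Document.new(xml)
  REXML::XPath.match(doc, '//root/p/child::text()').map(&:value)
end

paragraph_text_children(SAMPLE_XHTML).each { |text| puts text.inspect }
```

Note that the text inside the span ('world') is not selected, because it is a child of the span and not of the p.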

Trying the following XPath gives me the last text node of the last paragraph only, not the last text node of each paragraph that is a child of the root node.

//root/p/child::text()[last()]

This only returns: 'The End '

What I would like XPath to return is therefore:

' a '
'Hi there '
'The End '

Is this possible with XPath?

html ruby xhtml xpath rexml

Diego Barros, Nov 3 '08 at 3:37

XSL has normalize-space(), which may be worth a look here.

AmbroseChapel, Nov 3 '08 at 6:27

nickf · Accepted Answer · 2008-11-03T04:07:28+0000

//p/child::text()[last()]
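In XPath 1.0 a predicate such as `[last()]` inside a step is evaluated per context node, so this expression is meant to select the last text child of each p, not the last text node in the whole document. If the XPath engine in use does not behave that way, one alternative is to select the paragraphs with XPath and pick the last text child in Ruby. The following is a minimal sketch under that assumption, not the answer's own method; `trim_paragraph_tails` and `SAMPLE_DOC` are hypothetical names.

```ruby
require 'rexml/document'

# Sample input taken from the question, wrapped in an assumed <root> element.
SAMPLE_DOC = <<~XML
  <root>
  <p>hello <span>world</span> a </p>
  <p>Hi there </p>
  <p>The End </p>
  </root>
XML

# Hypothetical helper: strips trailing whitespace from the last text child
# of every <p> and returns the serialized root element.
def trim_paragraph_tails(xml)
  doc = REXML::Document.new(xml)

  REXML::XPath.each(doc, '//p') do |para|
    # Element#texts returns only the direct text children of this <p>.
    last_text = para.texts.last
    next unless last_text

    last_text.value = last_text.value.rstrip
  end

  doc.root.to_s
end

puts trim_paragraph_tails(SAMPLE_DOC)
```

Like the XPath above, this picks the last text child even when an element follows it, so a paragraph ending in an inline element (for example a trailing span) would need extra handling.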
