XPath: select text after a specific tag and before the same next tag

I have the following HTML:

<strong>Term:</strong> Some text<br /> More text<br /> Some more lines of text <strong>Term:</strong> Some text<br /> More text<br /> Some more lines of text <strong>Second term:</strong> Some text<br /> More text<br /> Some more lines of text <strong>Term:</strong> Some text<br /> More text<br /> Some more lines of text 

I need to get the text nodes that follow each tag containing the text "Term", up to the next such tag:

 Some text More text Some more lines of text Some text More text Some more lines of text Some text More text Some more lines of text 

The condition could be that the preceding tag must contain the text "Term", but I do not know how to write an XPath selector that expresses this.

2 answers
 //text()[preceding::*[contains(text(),'Term:')] and following::*[contains(text(),'Term:')]] 

This is similar to what @empo suggested, except that I look for any node containing "Term:" and return all the text nodes between such nodes.

However, this only works well if the document contains no other kind of "Term" label. Let me know if it does, because then this XPath will also return some unwanted values.

Update: since you have changed the input, I added one more condition to the previous XPath:

 //text()[preceding::*[contains(text(),'Term:')] and following::*[contains(text(),'Term:')] and not(contains(., 'Term:'))] 

@empo's solution also works, but it relies on the <strong> element specifically. The XPath I wrote only checks for the string "Term:" and returns all the text nodes between the matching elements.

Let me know if this works for you.

Sincerely.


Your question is still ambiguous, and your original document is not well-formed. Try this:

 root/text()[preceding::strong[1][contains(text(),'Term')]] 

Applied to:

 <root> <strong>Term:</strong> Some text<br /> More text<br /> Some more lines of text <strong>Term:</strong> Some text2<br /> More text2<br /> Some more lines of text2 <strong>Second term:</strong> Some text3<br /> More text3<br /> Some more lines of text3 <strong>Term:</strong> Some text4<br /> More text4<br /> Some more lines of text4 </root> 

gives:

 Some text More text Some more lines of text Some text2 More text2 Some more lines of text2 Some text4 More text4 Some more lines of text4 
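The predicate `preceding::strong[1][contains(text(),'Term')]` keeps a text node only when the nearest `<strong>` before it contains "Term". Since `contains()` is case-sensitive, "Second term:" does not match. The selection logic can be sketched in Python using only the standard library. This is an illustration, not the XPath engine itself: `xml.etree.ElementTree` does not support the `preceding` axis, so the sketch walks the children of `root` in document order instead. The helper name `texts_after_label` is hypothetical, and unlike the XPath it strips surrounding whitespace from each text node.

```python
import xml.etree.ElementTree as ET

DOC = (
    "<root> <strong>Term:</strong> Some text<br /> More text<br /> "
    "Some more lines of text <strong>Term:</strong> Some text2<br /> "
    "More text2<br /> Some more lines of text2 <strong>Second term:</strong> "
    "Some text3<br /> More text3<br /> Some more lines of text3 "
    "<strong>Term:</strong> Some text4<br /> More text4<br /> "
    "Some more lines of text4 </root>"
)


def texts_after_label(root, needle="Term"):
    """Collect text that follows a <strong> whose text contains `needle`,
    up to the next <strong> (assumes a flat structure under `root`)."""
    selected = []
    last_label = None  # text of the nearest preceding <strong>, if any
    for child in root:
        if child.tag == "strong":
            last_label = child.text or ""
        # In ElementTree, the text following an element is stored in its .tail
        tail = (child.tail or "").strip()
        if tail and last_label is not None and needle in last_label:
            selected.append(tail)
    return selected


result = texts_after_label(ET.fromstring(DOC))
for line in result:
    print(line)
```

The text under "Second term:" is skipped because the match is case-sensitive, which mirrors the behaviour of `contains()` in XPath 1.0.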

The following XPath selects all text nodes between an element containing the string "Term:" and an element containing any text:

 //text()[preceding::*[contains(text(),'Term:')] and following::*[text()]] 

Applied to:

 <root> <strong>Term:</strong> Some text<br /> More text<br /> Some more lines of text <strong>Second term:</strong> Some text2<br /> More text2<br /> Some more lines of text2 </root> 

it returns:

 Some text More text Some more lines of text 
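To see why the text after "Second term:" is dropped here, note that the second condition, `following::*[text()]`, requires some later element that has text of its own, and only `<br />` elements follow it. Below is a rough standard-library Python sketch of the same two conditions. It assumes a flat document like the one above, the helper name `texts_between_markers` is hypothetical, and it strips whitespace, which the XPath does not.

```python
import xml.etree.ElementTree as ET

DOC2 = (
    "<root> <strong>Term:</strong> Some text<br /> More text<br /> "
    "Some more lines of text <strong>Second term:</strong> Some text2<br /> "
    "More text2<br /> Some more lines of text2 </root>"
)


def texts_between_markers(root, needle="Term:"):
    """Keep text that has an earlier element containing `needle`
    and a later element that carries any text at all."""
    children = list(root)
    selected = []
    for i, child in enumerate(children):
        tail = (child.tail or "").strip()
        if not tail:
            continue
        # Elements up to and including `child` end before this text starts
        has_marker_before = any(needle in (c.text or "") for c in children[: i + 1])
        # <br /> elements have no text, so they do not satisfy this condition
        has_text_element_after = any(c.text for c in children[i + 1:])
        if has_marker_before and has_text_element_after:
            selected.append(tail)
    return selected


result = texts_between_markers(ET.fromstring(DOC2))
for line in result:
    print(line)
```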

Source: https://habr.com/ru/post/890992/

