How to get rows of a nested HTML table using XSLT

I am trying to get table rows from XHTML using XPath / XSLT. My XHTML looks like this:

 <body>
   <....>
   <table>
     <tbody>
       <tr>
         <td/>
         <td/>
         <td>
           <table>
             <tr> <....> </tr>
           </table>
         </td>
       </tr>
     </tbody>
   </table>
 </body>

In the above structure, <tbody> may or may not be present. Tables can be nested to any depth. I want to get only the rows that belong to a given table: when I process the outer table, I want only the outer row (the one containing three tds), not the inner tr (inside the nested table). How can I do this using XSLT or XPath?

Edit: what I'm looking for is a way to get all descendant::y of a node x, where y must not be a descendant of another, nested x. That is, the path from x to y must not contain another x. I may not have anything that distinguishes an outer x from an inner x.

Note: I need to do this with many HTML files that have different structures, and I cannot change the structure of any of them; I have to work with them as given. The only guarantee is that they are all well-formed XHTML.

Thank you for your help.

2 answers

What I'm mostly looking for is a way to get all descendant::y for node x, but y should not be a descendant of another x.

Suppose $n is an element named x. Then you want:

 $n//y[count(ancestor::x) = count($n/ancestor-or-self::x)] 

This selects all y elements that are descendants of $n and whose number of x ancestors equals the number of x elements on the ancestor-or-self axis of $n, that is, one more than the number of ancestor::x elements of $n.

Since $n is itself an x element, this means that for every selected y, the x element held in $n is its nearest ancestor::x.

For your practical purposes, you only need to substitute $n above with the exact XPath expression that selects the containing x element.
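
A minimal XSLT 1.0 sketch of the expression above, with x taken to be table and y taken to be tr. It assumes the elements are in no namespace, as in the question's sample; if the files declare the XHTML namespace, the element names would need a prefix bound to that namespace. The text output is only an illustration and not part of the answer.

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Visit every table in the document, outer or nested, one at a time. -->
    <xsl:for-each select="//table">
      <!-- $n plays the role of x (the current table); tr plays the role of y. -->
      <xsl:variable name="n" select="."/>
      <!-- Keep only the rows whose nearest table ancestor is $n itself. -->
      <xsl:variable name="rows"
          select="$n//tr[count(ancestor::table) = count($n/ancestor-or-self::table)]"/>
      <xsl:text>table with </xsl:text>
      <xsl:value-of select="count($rows)"/>
      <xsl:text> own row(s)&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Because the predicate compares ancestor counts rather than element names along the path, it works the same whether or not a tbody sits between the table and its rows.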


The following expression selects the tr elements of every table element that has no table ancestor (that is, only the outermost tables), whether or not the table has a tbody element:

 //table[not(ancestor::table)]/tbody/tr|//table[not(ancestor::table)]/tr 

This is the union of two separate expressions: the first selects the rows when tbody is present, and the second selects them when it is not.
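
As a rough sketch, the union expression can be dropped into an XSLT 1.0 stylesheet as follows. The output element names rows and row are hypothetical, chosen only to make the result visible, and the elements are again assumed to be in no namespace.

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/">
    <rows>
      <!-- Rows of outermost tables only: via tbody when present, directly otherwise. -->
      <xsl:for-each select="//table[not(ancestor::table)]/tbody/tr
                          | //table[not(ancestor::table)]/tr">
        <!-- count(td) counts only the direct td children of this row. -->
        <row cells="{count(td)}"/>
      </xsl:for-each>
    </rows>
  </xsl:template>
</xsl:stylesheet>
```

Since both branches use child steps only, rows inside a nested table are never reached from the outer table.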


Source: https://habr.com/ru/post/1392518/

