How to recursively remove empty XML elements using XSLT

So, I'm stuck in an annoying situation. I have XML like this:

<table border="1" cols="200 100pt 200">
  <tr>
    <td>isbn</td>
    <td>title</td>
    <td>price</td>
  </tr>
  <tr>
    <td />
    <td />
    <td>
      <span type="champsimple" id="9b297fb5-d12b-46b1-8899-487a2df0104e" categorieid="a1c70692-0427-425b-983c-1a08b6585364" champcoderef="01f12b93-b4c5-401b-9da1-c9385d77e43f"> [prénom] </span>
      <span type="champsimple" id="e103a6a5-d1be-4c34-8a54-d234179fb4ea" categorieid="a1c70692-0427-425b-983c-1a08b6585364" champcoderef="01f12b93-b4c5-401b-9da1-c9385d77e43f">[nom]</span>
      <span></span>
    </td>
  </tr>
  <tr></tr>
  <tr>
    <td></td>
    <td>Phill It in</td>
  </tr>
  <tr>
    <table id="cas1">
      <tr>
        <td></td>
        <td>foo</td>
      </tr>
      <tr>
        <td>bar</td>
        <td>boo</td>
      </tr>
    </table>
  </tr>
  <tr>
    <table id="cas2">
      <tr>
        <td></td>
        <td>foo</td>
      </tr>
      <tr>
        <td></td>
        <td>boo</td>
      </tr>
    </table>
  </tr>
  <tr>
    <table id="cas3">
      <tr>
        <td>bar</td>
        <td></td>
      </tr>
      <tr>
        <td>foo</td>
        <td>boo</td>
      </tr>
    </table>
  </tr>
  <tr>
    <table id="cas4">
      <tr>
        <td />
        <td />
      </tr>
      <tr>
        <td>foo</td>
        <td>boo</td>
      </tr>
    </table>
  </tr>
  <table id="cas4">
    <tr>
      <td />
      <td />
    </tr>
    <tr>
      <td>foo</td>
      <td>boo</td>
    </tr>
  </table>
  <tr>
    <td />
    <td />
  </tr>
</table>

Now the question is, how would I recursively delete all empty td, tr and table elements?

Right now I am using this XSLT:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>
  <xsl:strip-space elements="*" />

  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="td[not(node())]" />
  <xsl:template match="tr[not(node())]" />
  <xsl:template match="table[not(node())]" />
</xsl:stylesheet>


But this is not good enough. Once the empty td elements are deleted, the enclosing tr becomes empty, and the stylesheet does not handle that. See the table elements with id "cas4" in the output:

<table border="1" cols="200 100pt 200">
  <tr>
    <td>isbn</td>
    <td>title</td>
    <td>price</td>
  </tr>
  <tr>
    <td>
      <span type="champsimple" id="9b297fb5-d12b-46b1-8899-487a2df0104e" categorieid="a1c70692-0427-425b-983c-1a08b6585364" champcoderef="01f12b93-b4c5-401b-9da1-c9385d77e43f"> [prénom] </span>
      <span type="champsimple" id="e103a6a5-d1be-4c34-8a54-d234179fb4ea" categorieid="a1c70692-0427-425b-983c-1a08b6585364" champcoderef="01f12b93-b4c5-401b-9da1-c9385d77e43f">[nom]</span>
      <span />
    </td>
  </tr>
  <tr>
    <td>Phill It in</td>
  </tr>
  <tr>
    <table id="cas1">
      <tr>
        <td>foo</td>
      </tr>
      <tr>
        <td>bar</td>
        <td>boo</td>
      </tr>
    </table>
  </tr>
  <tr>
    <table id="cas2">
      <tr>
        <td>foo</td>
      </tr>
      <tr>
        <td>boo</td>
      </tr>
    </table>
  </tr>
  <tr>
    <table id="cas3">
      <tr>
        <td>bar</td>
      </tr>
      <tr>
        <td>foo</td>
        <td>boo</td>
      </tr>
    </table>
  </tr>
  <tr>
    <table id="cas4">
      <tr />
      <tr>
        <td>foo</td>
        <td>boo</td>
      </tr>
    </table>
  </tr>
  <table id="cas4">
    <tr />
    <tr>
      <td>foo</td>
      <td>boo</td>
    </tr>
  </table>
  <tr />
</table>


How would you solve this problem?

3 answers

Here is your solution:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" encoding="utf-8" indent="yes"/>

  <xsl:template match="node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()" />
    </xsl:copy>
  </xsl:template>

  <xsl:template match="@* | text()">
    <xsl:copy />
  </xsl:template>

  <xsl:template match="table | tr | td">
    <!-- result of the transformation of descendants -->
    <xsl:variable name="content">
      <xsl:apply-templates select="node()" />
    </xsl:variable>
    <!-- if there are any children left then copy myself -->
    <xsl:if test="count($content/node()) > 0">
      <xsl:copy>
        <xsl:apply-templates select="@*" />
        <xsl:copy-of select="$content" />
      </xsl:copy>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>

The idea is simple: each table, tr or td first transforms its descendants and then checks whether anything is left. If so, it copies itself together with the result of that transformation.
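Two things can get in the way of this approach. In a strict XSLT 1.0 processor the variable holds a result tree fragment, and the specification does not permit a path step such as $content/node() on it. And without xsl:strip-space, the whitespace between cells in an indented input counts as remaining content. Below is a minimal sketch that addresses both, assuming the processor implements the EXSLT function exsl:node-set() (libxslt does, for example; MSXML offers msxsl:node-set() under its own namespace instead). The exsl prefix and the strip-space line are my additions, not part of the stylesheet above.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:exsl="http://exslt.org/common"
    exclude-result-prefixes="exsl">
  <xsl:output method="xml" indent="yes"/>

  <!-- assumption: indentation whitespace should not count as content -->
  <xsl:strip-space elements="*"/>

  <!-- identity template: copy everything that no other template handles -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="table | tr | td">
    <!-- transform the children first; the result is a result tree fragment -->
    <xsl:variable name="content">
      <xsl:apply-templates select="node()"/>
    </xsl:variable>
    <!-- convert the fragment to a node-set before navigating into it -->
    <xsl:if test="exsl:node-set($content)/node()">
      <xsl:copy>
        <xsl:apply-templates select="@*"/>
        <xsl:copy-of select="$content"/>
      </xsl:copy>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

An XSLT 2.0 processor treats the variable as a temporary tree, so there the extension function should not be needed.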

If you want to keep the table structure and delete only empty rows (<tr> elements containing only empty <td> elements), then just create a similar template for <tr> with a different condition and leave the <td> elements alone.

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" encoding="utf-8" indent="yes"/>

  <xsl:template match="node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()" />
    </xsl:copy>
  </xsl:template>

  <xsl:template match="@* | text()">
    <xsl:copy />
  </xsl:template>

  <xsl:template match="table">
    <!-- result of the transformation of descendants -->
    <xsl:variable name="content">
      <xsl:apply-templates select="node()" />
    </xsl:variable>
    <!-- if there are any children left then copy myself -->
    <xsl:if test="count($content/node()) > 0">
      <xsl:copy>
        <xsl:apply-templates select="@*" />
        <xsl:copy-of select="$content" />
      </xsl:copy>
    </xsl:if>
  </xsl:template>

  <xsl:template match="tr">
    <!-- result of the transformation of descendants -->
    <xsl:variable name="content">
      <xsl:apply-templates select="node()" />
    </xsl:variable>
    <!-- number of non-empty td elements -->
    <xsl:variable name="cellCount">
      <xsl:value-of select="count($content/td[node()])" />
    </xsl:variable>
    <!-- number of other elements -->
    <xsl:variable name="elementCount">
      <xsl:value-of select="count($content/node()[name() != 'td'])" />
    </xsl:variable>
    <xsl:if test="$cellCount > 0 or $elementCount > 0">
      <xsl:copy>
        <xsl:apply-templates select="@*" />
        <xsl:copy-of select="$content" />
      </xsl:copy>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>

Well, actually the last xsl:if should look like this:

<xsl:choose>
  <!-- if there are cells then copy the content -->
  <xsl:when test="$cellCount > 0">
    <xsl:copy>
      <xsl:apply-templates select="@*" />
      <xsl:copy-of select="$content" />
    </xsl:copy>
  </xsl:when>
  <!-- if there are only other elements copy them -->
  <xsl:when test="$elementCount > 0">
    <xsl:copy>
      <xsl:apply-templates select="@*" />
      <xsl:copy-of select="$content/node()[name() != 'td']" />
    </xsl:copy>
  </xsl:when>
</xsl:choose>

This covers the case where a <tr> contains both empty <td> elements and other elements: then you want to remove the <td> elements and keep only the rest.


It appears that your definition of empty is "contains no text, or only whitespace". Is that right? If so, the following transformation should do the trick:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>
  <xsl:strip-space elements="*" />

  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="td[not(normalize-space(.))]" />
  <xsl:template match="tr[not(normalize-space(.))]" />
  <xsl:template match="table[not(normalize-space(.))]" />
</xsl:stylesheet>
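The string value of an element is the concatenation of all text in its descendants, so normalize-space(.) is empty exactly when the element holds no non-blank text at any depth. That is why three one-line templates are enough here without any explicit recursion. The flip side is that a cell holding only markup without text (an image element, say) is dropped as well. If that matters, a variant along the following lines might help. Which descendant elements count as content is purely my assumption, since the question does not say, and note that with this rule a td holding nothing but an empty span would be kept.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- identity template -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <!-- Drop a td, tr or table only when it has no non-blank text at any depth
       AND no descendant element other than td, tr or table. -->
  <xsl:template match="*[self::td or self::tr or self::table]
                        [not(normalize-space(.))]
                        [not(.//*[not(self::td or self::tr or self::table)])]"/>
</xsl:stylesheet>
```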

You can also filter out any table containing only <tr> elements with empty <td> elements, and any <tr> containing only empty <td> elements (in addition to your other filters), using something like this (not tested):

<xsl:template match="tr[not(td/node())]" />
<xsl:template match="table[not(tr/td/node())]" />
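One thing to watch with these patterns: tr[not(td/node())] also matches a row that has no td children at all, such as the rows in the question's input that wrap a nested table directly, so those rows and everything inside them would be dropped. A hedged sketch of a guarded version follows; the extra conditions are my own additions. It still looks only one level ahead in the source tree, so it is not a fully recursive solution.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- identity template -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <!-- empty cells, as in the question -->
  <xsl:template match="td[not(node())]"/>

  <!-- drop a row only if it has no non-empty td
       and no child element other than td -->
  <xsl:template match="tr[not(td/node()) and not(*[not(self::td)])]"/>

  <!-- drop a table only if all of its child elements are rows of that kind -->
  <xsl:template match="table[not(tr/td/node())
                             and not(tr/*[not(self::td)])
                             and not(*[not(self::tr)])]"/>
</xsl:stylesheet>
```

On the input from the question, this should remove the rows of empty cells in both "cas4" tables while keeping the rows that wrap a nested table.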

Source: https://habr.com/ru/post/1306937/

