XSLT - How to Save Only Required Elements from XML

I have several XML files that contain a lot of overhead. I want to keep only about 20 specific elements and filter out everything else. I know the names of all the elements I want to keep, and I also know whether they are child elements and who their parents are. The elements I keep must still be at their original place in the hierarchy after the transformation.

E.g., I want to keep ONLY <ns:currency> in:

    <ns:stuff>
      <ns:things>
        <ns:currency>somecurrency</ns:currency>
        <ns:currency_code/>
        <ns:currency_code2/>
        <ns:currency_code3/>
        <ns:currency_code4/>
      </ns:things>
    </ns:stuff>

And turn it into this:

    <ns:stuff>
      <ns:things>
        <ns:currency>somecurrency</ns:currency>
      </ns:things>
    </ns:stuff>

What would be the best way to build XSLT for this?

+6
2 answers

Here is a general transformation:

    <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:ns="some:ns">
      <xsl:output omit-xml-declaration="yes" indent="yes"/>
      <xsl:strip-space elements="*"/>

      <!-- Names of the elements to keep -->
      <ns:WhiteList>
        <name>ns:currency</name>
        <name>ns:currency_code3</name>
      </ns:WhiteList>

      <!-- Identity rule: copy every node and attribute as-is -->
      <xsl:template match="node()|@*">
        <xsl:copy>
          <xsl:apply-templates select="node()|@*"/>
        </xsl:copy>
      </xsl:template>

      <!-- Drop any element that is not whitelisted and has no whitelisted descendant -->
      <xsl:template match=
        "*[not(descendant-or-self::*[name()=document('')/*/ns:WhiteList/*])]"/>
    </xsl:stylesheet>

When applied to the provided XML document (with a namespace declaration added to make it well-formed):

    <ns:stuff xmlns:ns="some:ns">
      <ns:things>
        <ns:currency>somecurrency</ns:currency>
        <ns:currency_code/>
        <ns:currency_code2/>
        <ns:currency_code3/>
        <ns:currency_code4/>
      </ns:things>
    </ns:stuff>

it produces the wanted result (the whitelisted elements and their structural relationships are preserved):

    <ns:stuff xmlns:ns="some:ns">
      <ns:things>
        <ns:currency>somecurrency</ns:currency>
        <ns:currency_code3/>
      </ns:things>
    </ns:stuff>

Explanation

  • The identity rule (template) copies every node as-is.

  • The stylesheet contains a top-level <ns:WhiteList> element whose <name> children list the names of all whitelisted elements, that is, the elements that must be kept in the document together with their structural relationships.

  • The <ns:WhiteList> element is best kept in a separate document, so that the stylesheet does not need to be edited when names are added. Here the whitelist sits in the stylesheet itself for convenience only.

  • A single template overrides the identity template. It does not process (that is, it deletes) any element that is not whitelisted and has no whitelisted descendant.
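The suggestion to keep the whitelist in a separate document could look roughly like the sketch below. The file name whitelist.xml and its un-namespaced <WhiteList> root are assumptions made for illustration; they do not come from the answer. The whitelist file might be:

```xml
<!-- whitelist.xml: hypothetical file, stored next to the stylesheet -->
<WhiteList>
  <name>ns:currency</name>
  <name>ns:currency_code3</name>
</WhiteList>
```

and the stylesheet would then read the names from that file instead of from itself:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Identity rule: copy every node and attribute as-is -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <!-- Drop elements that are neither listed in whitelist.xml (assumed name)
       nor ancestors of a listed element. The document() call is written
       inline because XSLT 1.0 does not allow variable references in
       match patterns. -->
  <xsl:template match=
    "*[not(descendant-or-self::*[name()=document('whitelist.xml')/WhiteList/name])]"/>
</xsl:stylesheet>
```

In both variants the comparison uses name(), so it matches the prefixed name exactly as it is written in the source document. A source file that binds the same namespace to a different prefix would need correspondingly different entries in the whitelist.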

+12

In XSLT, you usually don't remove the elements you want to drop; instead, you copy the elements you want to keep:

    <?xml version="1.0" encoding="UTF-8"?>
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:ns="http://www.example.com/ns#"
        version="1.0">
      <xsl:output method="xml" indent="yes" omit-xml-declaration="no"/>

      <!-- Copy the root element and descend only into ns:things -->
      <xsl:template match="/ns:stuff">
        <xsl:copy>
          <xsl:apply-templates select="ns:things"/>
        </xsl:copy>
      </xsl:template>

      <!-- Copy ns:things and descend only into the elements to keep -->
      <xsl:template match="ns:things">
        <xsl:copy>
          <xsl:apply-templates select="ns:currency"/>
          <xsl:apply-templates select="ns:currency_code3"/>
        </xsl:copy>
      </xsl:template>

      <!-- Copy the kept elements together with their content -->
      <xsl:template match="ns:currency">
        <xsl:copy-of select="."/>
      </xsl:template>

      <xsl:template match="ns:currency_code3">
        <xsl:copy-of select="."/>
      </xsl:template>
    </xsl:stylesheet>

The example above copies only currency and currency_code3. The output is as follows:

    <?xml version="1.0" encoding="UTF-8"?>
    <ns:stuff xmlns:ns="http://www.example.com/ns#">
      <ns:things>
        <ns:currency>somecurrency</ns:currency>
        <ns:currency_code3/>
      </ns:things>
    </ns:stuff>

Note: I have added a namespace declaration for your ns prefix.

If you want to copy everything except a few elements, see this answer.
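The answer referred to is not reproduced here. As a minimal sketch of the opposite case, a common pattern is an identity template plus an empty template for the elements to drop. The choice of dropped elements below is only an illustration, and the namespace URI is the one assumed in the stylesheet above:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:ns="http://www.example.com/ns#">
  <xsl:output method="xml" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Identity template: copies every node and attribute unchanged -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <!-- Empty template: the matched elements produce no output.
       The element names are picked for illustration only. -->
  <xsl:template match="ns:currency_code | ns:currency_code2 | ns:currency_code4"/>
</xsl:stylesheet>
```

The empty template wins over the identity template for those three elements because a pattern that names an element has a higher default priority than node().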

+4

Source: https://habr.com/ru/post/886698/

