Splitting and flattening nodes using XSLT

I cannot have nested spans, so I need to flatten them and concatenate their class attributes so that I can keep track of which classes belonged to the parents.

Here is a simplified input:

    <body>
      <h1 class="section">Title</h1>
      <p class="main">
        ZZZ
        <span class="a">
          AAA
          <span class="b">
            BBB
            <span class="c">
              CCC
              <preserveMe> eeee </preserveMe>
            </span>
            bbb
            <preserveMe> eeee </preserveMe>
          </span>
          aaa
        </span>
      </p>
    </body>

Here is the desired output:

    <body>
      <h1 class="section">Title</h1>
      <p class="main">
        ZZZ
        <span class="a"> AAA </span>
        <span class="ab"> BBB </span>
        <span class="abc"> CCC <preserveMe> eeee </preserveMe> </span>
        <span class="ab"> bbb <preserveMe> eeee </preserveMe> </span>
        <span class="a"> aaa </span>
      </p>
    </body>

Here is the closest I have come (I'm really new to this, so even getting this far took a long time...):

    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output omit-xml-declaration="yes" indent="yes"/>
      <xsl:strip-space elements="*"/>

      <xsl:template match="/">
        <p>
          <xsl:apply-templates/>
        </p>
      </xsl:template>

      <xsl:template match="*/span">
        <span class='{concat(../../@class,../@class,@class)}'>
          <xsl:value-of select='.'/>
        </span>
        <xsl:apply-templates/>
      </xsl:template>
    </xsl:stylesheet>

If you run it yourself, you can see the result of my unsuccessful attempt and how far it is from what I actually want. Ideally, I would like the solution to handle an arbitrary number of nesting levels and also cope with interrupted nests (span, span, notSpan, span ...).

edit: I added tags inside the nested structure, as requested by the commenters below. In addition, I am using XSLT 1.0, but I could use another version if necessary.

edit 2: I realized that my example was too simple compared to what I really need to convert. Namely, I cannot lose the classes from other tags; only spans can be combined.

4 answers

As I mentioned in the opening comments, this is far from trivial. Here is another approach you might consider:

XSLT 1.0

    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
      <xsl:strip-space elements="*"/>

      <!-- identity transform -->
      <xsl:template match="@*|node()">
        <xsl:copy>
          <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
      </xsl:template>

      <xsl:template match="p">
        <xsl:copy>
          <xsl:apply-templates select="@*|node()|.//span/text()"/>
        </xsl:copy>
      </xsl:template>

      <xsl:template match="span/text()">
        <span>
          <xsl:attribute name="class">
            <xsl:for-each select="ancestor::span">
              <xsl:value-of select="@class"/>
            </xsl:for-each>
          </xsl:attribute>
          <xsl:apply-templates select="preceding-sibling::*"/>
          <xsl:value-of select="." />
          <xsl:if test="not(following-sibling::text())">
            <xsl:apply-templates select="following-sibling::*"/>
          </xsl:if>
        </span>
      </xsl:template>

      <xsl:template match="span"/>
    </xsl:stylesheet>

This is pretty much what Lingamurthy CS previously suggested, but you will see the difference with the following test input:

XML

    <body>
      <h1 class="section">Title</h1>
      <p class="main">
        ZZZ
        <preserveMe>0</preserveMe>
        <span class="a">
          AAA
          <span class="b">
            BBB
            <span class="c">
              CCC
              <preserveMe>c</preserveMe>
            </span>
            bbb
            <preserveMe>b</preserveMe>
          </span>
          aaa
        </span>
        <preserveMe>1</preserveMe>
      </p>
    </body>
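
Tracing the XSLT 1.0 stylesheet above by hand against this input gives roughly the result below. This is a sketch that was not run through a processor, and the exact whitespace and indentation depend on how the processor handles indent="yes". The point to notice is that the preserveMe children of p (0 and 1) survive alongside the flattened spans.

    <body>
      <h1 class="section">Title</h1>
      <p class="main"> ZZZ <preserveMe>0</preserveMe>
        <span class="a"> AAA </span>
        <span class="ab"> BBB </span>
        <span class="abc"> CCC <preserveMe>c</preserveMe></span>
        <span class="ab"> bbb <preserveMe>b</preserveMe></span>
        <span class="a"> aaa </span>
        <preserveMe>1</preserveMe>
      </p>
    </body>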

I hope the following stylesheet helps:

    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output omit-xml-declaration="yes" indent="yes"/>
      <xsl:strip-space elements="*"/>

      <!-- Identity transform template -->
      <xsl:template match="@* | node()">
        <xsl:copy>
          <xsl:apply-templates select="@* | node()"/>
        </xsl:copy>
      </xsl:template>

      <xsl:template match="p">
        <xsl:copy>
          <xsl:apply-templates select="@* | text() | .//text()[parent::span]"/>
        </xsl:copy>
      </xsl:template>

      <xsl:template match="text()[parent::span]">
        <span>
          <xsl:attribute name="class">
            <xsl:call-template name="class-value"/>
          </xsl:attribute>
          <xsl:value-of select="."/>
          <xsl:apply-templates select="following-sibling::node()[1][not(self::text()) and not(self::span)]"/>
        </span>
      </xsl:template>

      <xsl:template name="class-value">
        <xsl:for-each select="ancestor::span/@class">
          <xsl:value-of select="."/>
        </xsl:for-each>
      </xsl:template>
    </xsl:stylesheet>

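Here you go. I did this recursively using a named template, nested-span, that takes two parameters: cclass, the class string accumulated so far, and cspan, the current span node. The template emits the current span and then processes the nested spans. Note that it relies on XSLT 2.0 constructs (exists(), the as attribute on xsl:param, and select on xsl:attribute).

So I just call the template for the top-level spans, which in our case are the span elements directly under the p tag.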

    <!-- Requires an XSLT 2.0 processor: uses exists(), the "as" attribute,
         and "select" on xsl:attribute. -->
    <xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

      <!-- Entry point: call nested-span for every span directly under p -->
      <xsl:template match="/">
        <hmtl>
          <body>
            <p>
              <xsl:for-each select='.//p/span'>
                <xsl:call-template name='nested-span'>
                  <xsl:with-param name='cclass' select='./@class'/>
                  <xsl:with-param name='cspan' select='.'/>
                </xsl:call-template>
              </xsl:for-each>
            </p>
          </body>
        </hmtl>
      </xsl:template>

      <!-- Identity transform for everything that is not a span -->
      <xsl:template match="@* | node()">
        <xsl:copy>
          <xsl:apply-templates select="@* | node()"/>
        </xsl:copy>
      </xsl:template>

      <xsl:template name="nested-span">
        <xsl:param name='cclass'/>
        <xsl:param name='cspan' as='node()'/>
        <!-- Leading text of the current span -->
        <span>
          <xsl:attribute name='class' select='$cclass'/>
          <xsl:value-of select='$cspan/text()[1]'/>
          <xsl:if test="not(exists(./span))">
            <xsl:if test='string-length($cspan/text()[2]) &gt; 0 '>
              <xsl:value-of select='$cspan/text()[2]'/>
            </xsl:if>
            <xsl:apply-templates select="./*[local-name() != 'span']"/>
          </xsl:if>
        </span>
        <!-- Recurse into nested spans, appending their class -->
        <xsl:for-each select='$cspan/span'>
          <xsl:call-template name='nested-span'>
            <xsl:with-param name='cclass' select='concat($cclass, ./@class)'/>
            <xsl:with-param name='cspan' select='.'/>
          </xsl:call-template>
        </xsl:for-each>
        <!-- Trailing text and non-span children after the nested spans -->
        <xsl:if test="exists(./span)">
          <span>
            <xsl:attribute name='class' select='$cclass'/>
            <xsl:if test='string-length($cspan/text()[2]) &gt; 0 '>
              <xsl:value-of select='$cspan/text()[2]'/>
            </xsl:if>
            <xsl:apply-templates select="./*[local-name() != 'span']"/>
          </span>
        </xsl:if>
      </xsl:template>
    </xsl:stylesheet>

And here is the output:

    <hmtl>
      <body>
        <p>
          <span class="a"> AAA </span>
          <span class="ab"> BBB </span>
          <span class="abc"> CCC <preserveMe> eeee </preserveMe> </span>
          <span class="ab"> bbb <preserveMe> eeee </preserveMe> </span>
          <span class="a"> aaa </span>
        </p>
      </body>
    </hmtl>

Hope this helps.


Since you said XSLT 2.0 might be a viable option, I tried an approach based on node grouping:

    <xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
      <xsl:strip-space elements="*"/>

      <xsl:template match="* | @* | text()">
        <xsl:copy>
          <xsl:apply-templates select="* | @* | text()"/>
        </xsl:copy>
      </xsl:template>

      <xsl:template match="span">
        <xsl:for-each-group select="* | text()" group-adjacent="name() = 'span'">
          <xsl:choose>
            <xsl:when test="current-group()/self::span">
              <!-- a group of span elements: nothing to do yet -->
              <xsl:apply-templates select="current-group()"/>
            </xsl:when>
            <xsl:otherwise>
              <!-- a group of text nodes and non-span elements: create span -->
              <span class="{string-join((ancestor::span/@class), '')}">
                <xsl:apply-templates select="current-group()"/>
              </span>
            </xsl:otherwise>
          </xsl:choose>
        </xsl:for-each-group>
      </xsl:template>
    </xsl:stylesheet>

Key points:

  • the children of each span, both text nodes and other elements, are grouped according to whether they are span elements or not
  • it produces the same output as Michael's solution
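
To make the grouping step more concrete, here is a minimal XSLT 2.0 sketch that only reports the groups instead of transforming them. The groups and group elements and the is-span and size attributes are names made up for this illustration, and the sketch assumes the input document's root element is a span.

    <xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output method="xml" indent="yes"/>
      <xsl:strip-space elements="*"/>

      <!-- Report how the children of the root span are grouped -->
      <xsl:template match="/span">
        <groups>
          <xsl:for-each-group select="* | text()" group-adjacent="name() = 'span'">
            <!-- current-grouping-key() is the boolean shared by the whole group -->
            <group is-span="{current-grouping-key()}" size="{count(current-group())}"/>
          </xsl:for-each-group>
        </groups>
      </xsl:template>
    </xsl:stylesheet>

Applied to <span class="b"> BBB <span class="c"> CCC </span> bbb <preserveMe>b</preserveMe> </span>, this should report three groups: the leading text (not a span, one node), the nested span (one node), and the trailing text together with preserveMe (not a span, two nodes). In the stylesheet above, each non-span group becomes one flattened span. Its class is built from ancestor::span of the first node in the group, which includes the span currently being matched.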

Source: https://habr.com/ru/post/985391/

