XSLT: grouping HTML elements into section levels

I am trying to write an XSLT that organizes an HTML file into nested section levels, depending on the header level. Here is my input:

    <html>
      <head>
        <title></title>
      </head>
      <body>
        <h1>HEADER 1 CONTENT</h1>
        <p>Level 1 para</p>
        <p>Level 1 para</p>
        <p>Level 1 para</p>
        <p>Level 1 para</p>
        <h2>Header 2 CONTENT</h2>
        <p>Level 2 para</p>
        <p>Level 2 para</p>
        <p>Level 2 para</p>
        <p>Level 2 para</p>
      </body>
    </html>

I am currently working with a fairly simple structure, so this pattern will stay constant for the time being. I need output like this:

    <document>
      <section level="1">
        <header1>Header 1 CONTENT</header1>
        <p>Level 1 para</p>
        <p>Level 1 para</p>
        <p>Level 1 para</p>
        <p>Level 1 para</p>
        <section level="2">
          <header2>Header 2 CONTENT</header2>
          <p>Level 2 para</p>
          <p>Level 2 para</p>
          <p>Level 2 para</p>
          <p>Level 2 para</p>
        </section>
      </section>
    </document>

I worked with this example: Response to Stokes Stream

However, I cannot get it to do exactly what I need.

I am using Saxon 9 to run the XSLT in Oxygen during development. In production I will run it from a cmd/bat file, still with Saxon 9. I would like to handle up to 4 levels of nested sections, if possible.

Any help is much appreciated!

I need to add this because I ran into another requirement. I probably should have thought of it before.

I also encounter input like the following:

    <html>
      <head>
        <title></title>
      </head>
      <body>
        <p>Level 1 para</p>
        <p>Level 1 para</p>
        <p>Level 1 para</p>
        <p>Level 1 para</p>
        <h1>Header 2 CONTENT</h1>
        <p>Level 2 para</p>
        <p>Level 2 para</p>
        <p>Level 2 para</p>
        <p>Level 2 para</p>
      </body>
    </html>

As you can see, the first <p> elements here sit directly under <body> with no header before them, whereas in my first snippet every <p> belongs to a header level. My desired result is the same as above, except that when I encounter <p> elements directly under <body> like this, they should be wrapped in <section level="1">.

    <document>
      <section level="1">
        <p>Level 1 para</p>
        <p>Level 1 para</p>
        <p>Level 1 para</p>
        <p>Level 1 para</p>
      </section>
      <section level="1">
        <header1>Header 2 CONTENT</header1>
        <p>Level 2 para</p>
        <p>Level 2 para</p>
        <p>Level 2 para</p>
        <p>Level 2 para</p>
      </section>
    </document>
4 answers

Here is an XSLT 2.0 stylesheet:

    <xsl:stylesheet
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:xs="http://www.w3.org/2001/XMLSchema"
        xmlns:mf="http://example.com/mf"
        exclude-result-prefixes="xs mf"
        version="2.0">

      <xsl:output indent="yes"/>

      <xsl:function name="mf:group" as="node()*">
        <xsl:param name="elements" as="element()*"/>
        <xsl:param name="level" as="xs:integer"/>
        <xsl:for-each-group select="$elements"
            group-starting-with="*[local-name() eq concat('h', $level)]">
          <xsl:choose>
            <xsl:when test="self::*[local-name() eq concat('h', $level)]">
              <section level="{$level}">
                <xsl:element name="header{$level}"><xsl:apply-templates/></xsl:element>
                <xsl:sequence select="mf:group(current-group() except ., $level + 1)"/>
              </section>
            </xsl:when>
            <xsl:otherwise>
              <xsl:apply-templates select="current-group()"/>
            </xsl:otherwise>
          </xsl:choose>
        </xsl:for-each-group>
      </xsl:function>

      <xsl:template match="@* | node()">
        <xsl:copy>
          <xsl:apply-templates select="@*, node()"/>
        </xsl:copy>
      </xsl:template>

      <xsl:template match="/html">
        <document>
          <xsl:apply-templates select="body"/>
        </document>
      </xsl:template>

      <xsl:template match="body">
        <xsl:sequence select="mf:group(*, 1)"/>
      </xsl:template>

    </xsl:stylesheet>

It should do what you asked for, although it does not stop at four nested levels; it keeps grouping for as long as it finds h[n] elements.
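The two remaining requirements from the question (a cap of four levels, and wrapping paragraphs that precede the first heading) could be handled with a small variation of the function above. This is only a sketch under assumptions: the global variable `max-level` is a name introduced here for illustration, the input is assumed to be in no namespace as in the question, and anything before the first `h1` is assumed to belong in its own level-1 section.

```xml
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:mf="http://example.com/mf"
    exclude-result-prefixes="xs mf"
    version="2.0">

  <xsl:output indent="yes"/>

  <!-- Hypothetical setting: deepest heading level that still opens a section. -->
  <xsl:variable name="max-level" as="xs:integer" select="4"/>

  <xsl:function name="mf:group" as="node()*">
    <xsl:param name="elements" as="element()*"/>
    <xsl:param name="level" as="xs:integer"/>
    <xsl:for-each-group select="$elements"
        group-starting-with="*[local-name() eq concat('h', $level)]">
      <xsl:choose>
        <!-- Group led by a heading of the current level, within the cap. -->
        <xsl:when test="$level le $max-level
                        and self::*[local-name() eq concat('h', $level)]">
          <section level="{$level}">
            <xsl:element name="header{$level}"><xsl:apply-templates/></xsl:element>
            <xsl:sequence select="mf:group(current-group() except ., $level + 1)"/>
          </section>
        </xsl:when>
        <!-- Content before the first h1: wrap it in its own level-1 section. -->
        <xsl:when test="$level eq 1">
          <section level="1">
            <xsl:apply-templates select="current-group()"/>
          </section>
        </xsl:when>
        <!-- Plain content of the enclosing section, or headings past the cap:
             copied through unchanged. -->
        <xsl:otherwise>
          <xsl:apply-templates select="current-group()"/>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:for-each-group>
  </xsl:function>

  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@*, node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="/html">
    <document>
      <xsl:apply-templates select="body"/>
    </document>
  </xsl:template>

  <xsl:template match="body">
    <xsl:sequence select="mf:group(*, 1)"/>
  </xsl:template>

</xsl:stylesheet>
```

With the second input from the question, the four leading paragraphs should end up in one `<section level="1">` and the `h1` group in a sibling section. Headings deeper than `max-level` are not turned into sections; they are copied as-is inside the level-4 section, which may or may not be what you want.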


An XSLT 1.0 solution (essentially borrowed from Jeni Tennison):

    <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

      <xsl:output omit-xml-declaration="yes" indent="yes"/>
      <xsl:strip-space elements="*"/>

      <xsl:template match="html">
        <document><xsl:apply-templates/></document>
      </xsl:template>

      <xsl:template match="body">
        <xsl:apply-templates select="h1" />
      </xsl:template>

      <xsl:key name="next-headings" match="h6"
          use="generate-id(preceding-sibling::*[self::h1 or self::h2 or self::h3 or self::h4 or self::h5][1])" />
      <xsl:key name="next-headings" match="h5"
          use="generate-id(preceding-sibling::*[self::h1 or self::h2 or self::h3 or self::h4][1])" />
      <xsl:key name="next-headings" match="h4"
          use="generate-id(preceding-sibling::*[self::h1 or self::h2 or self::h3][1])" />
      <xsl:key name="next-headings" match="h3"
          use="generate-id(preceding-sibling::*[self::h1 or self::h2][1])" />
      <xsl:key name="next-headings" match="h2"
          use="generate-id(preceding-sibling::h1[1])" />

      <xsl:key name="immediate-nodes"
          match="node()[not(self::h1 | self::h2 | self::h3 | self::h4 | self::h5 | self::h6)]"
          use="generate-id(preceding-sibling::*[self::h1 or self::h2 or self::h3 or self::h4 or self::h5 or self::h6][1])" />

      <xsl:template match="h1 | h2 | h3 | h4 | h5 | h6">
        <xsl:variable name="vLevel" select="substring-after(name(), 'h')" />
        <section level="{$vLevel}">
          <xsl:element name="header{$vLevel}">
            <xsl:apply-templates />
          </xsl:element>
          <xsl:apply-templates select="key('immediate-nodes', generate-id())" />
          <xsl:apply-templates select="key('next-headings', generate-id())" />
        </section>
      </xsl:template>

      <xsl:template match="/*/*/node()" priority="-20">
        <xsl:copy-of select="." />
      </xsl:template>

    </xsl:stylesheet>

When this transformation is applied to the following XML document:

    <html>
      <body>
        <h1>1</h1>
        <p>1</p>
        <h2>1.1</h2>
        <p>2</p>
        <h3>1.1.1</h3>
        <p>3</p>
        <h2>1.2</h2>
        <p>4</p>
        <h1>2</h1>
        <p>5</p>
        <h2>2.1</h2>
        <p>6</p>
      </body>
    </html>

the desired result is produced:

    <document>
      <section level="1">
        <header1>1</header1>
        <p>1</p>
        <section level="2">
          <header2>1.1</header2>
          <p>2</p>
          <section level="3">
            <header3>1.1.1</header3>
            <p>3</p>
          </section>
        </section>
        <section level="2">
          <header2>1.2</header2>
          <p>4</p>
        </section>
      </section>
      <section level="1">
        <header1>2</header1>
        <p>5</p>
        <section level="2">
          <header2>2.1</header2>
          <p>6</p>
        </section>
      </section>
    </document>

A more general grouping in XSLT 1.0:

    <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

      <xsl:key name="kHeaderByPreceding"
          match="body/*[starts-with(name(),'h')]"
          use="generate-id(preceding-sibling::*
                 [starts-with(name(),'h')]
                 [substring(name(current()),2) > substring(name(),2)][1])"/>

      <xsl:key name="kElementByPreceding"
          match="body/*[not(starts-with(name(),'h'))]"
          use="generate-id(preceding-sibling::*
                 [starts-with(name(),'h')][1])"/>

      <xsl:template match="node()|@*" mode="copy">
        <xsl:copy>
          <xsl:apply-templates select="node()|@*" mode="copy"/>
        </xsl:copy>
      </xsl:template>

      <xsl:template match="body">
        <document>
          <xsl:apply-templates select="key('kHeaderByPreceding','')"/>
        </document>
      </xsl:template>

      <xsl:template match="body/*[starts-with(name(),'h')]">
        <section level="{substring(name(),2)}">
          <xsl:element name="header{substring(name(),2)}">
            <xsl:apply-templates mode="copy"/>
          </xsl:element>
          <xsl:apply-templates select="key('kElementByPreceding', generate-id())" mode="copy"/>
          <xsl:apply-templates select="key('kHeaderByPreceding', generate-id())"/>
        </section>
      </xsl:template>

      <xsl:template match="text()"/>

    </xsl:stylesheet>

Output:

    <document>
      <section level="1">
        <header1>HEADER 1 CONTENT</header1>
        <p>Level 1 para</p>
        <p>Level 1 para</p>
        <p>Level 1 para</p>
        <p>Level 1 para</p>
        <section level="2">
          <header2>Header 2 CONTENT</header2>
          <p>Level 2 para</p>
          <p>Level 2 para</p>
          <p>Level 2 para</p>
          <p>Level 2 para</p>
        </section>
      </section>
    </document>

And with a more complex input sample, such as:

    <body>
      <h1>1</h1>
      <p>1</p>
      <h2>1.1</h2>
      <p>2</p>
      <h3>1.1.1</h3>
      <p>3</p>
      <h2>1.2</h2>
      <p>4</p>
      <h1>2</h1>
      <p>5</p>
      <h2>2.1</h2>
      <p>6</p>
    </body>

Output:

    <document>
      <section level="1">
        <header1>1</header1>
        <p>1</p>
        <section level="2">
          <header2>1.1</header2>
          <p>2</p>
          <section level="3">
            <header3>1.1.1</header3>
            <p>3</p>
          </section>
        </section>
        <section level="2">
          <header2>1.2</header2>
          <p>4</p>
        </section>
      </section>
      <section level="1">
        <header1>2</header1>
        <p>5</p>
        <section level="2">
          <header2>2.1</header2>
          <p>6</p>
        </section>
      </section>
    </document>
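The same two keys can probably also cover the follow-up case from the question, where paragraphs appear before any heading. Elements with no preceding heading are indexed under the empty string, because `generate-id()` of an empty node-set returns `''`. The sketch below changes only the `body` template; the variable name `vLeading` is introduced here for illustration, and it assumes such leading content should get its own level-1 section.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:key name="kHeaderByPreceding"
      match="body/*[starts-with(name(),'h')]"
      use="generate-id(preceding-sibling::*
             [starts-with(name(),'h')]
             [substring(name(current()),2) > substring(name(),2)][1])"/>

  <xsl:key name="kElementByPreceding"
      match="body/*[not(starts-with(name(),'h'))]"
      use="generate-id(preceding-sibling::*
             [starts-with(name(),'h')][1])"/>

  <xsl:template match="node()|@*" mode="copy">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*" mode="copy"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="body">
    <document>
      <!-- Non-heading children of body that have no heading before them. -->
      <xsl:variable name="vLeading" select="key('kElementByPreceding', '')"/>
      <!-- Emit the wrapper only when there is leading content. -->
      <xsl:if test="$vLeading">
        <section level="1">
          <xsl:apply-templates select="$vLeading" mode="copy"/>
        </section>
      </xsl:if>
      <!-- Top-level headings, as in the stylesheet above. -->
      <xsl:apply-templates select="key('kHeaderByPreceding','')"/>
    </document>
  </xsl:template>

  <xsl:template match="body/*[starts-with(name(),'h')]">
    <section level="{substring(name(),2)}">
      <xsl:element name="header{substring(name(),2)}">
        <xsl:apply-templates mode="copy"/>
      </xsl:element>
      <xsl:apply-templates select="key('kElementByPreceding', generate-id())" mode="copy"/>
      <xsl:apply-templates select="key('kHeaderByPreceding', generate-id())"/>
    </section>
  </xsl:template>

  <xsl:template match="text()"/>

</xsl:stylesheet>
```

For the question's second input this should give one level-1 section holding the four leading paragraphs, followed by the level-1 section built from the `h1`. When the body starts with a heading, `$vLeading` is empty and the output should be the same as before.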

I was able to get something working for the additional case I described above. I added logic to the body template that checks for header tags. It may not work in every situation, but it is good enough for my task.

    <xsl:template match="body">
      <xsl:choose>
        <xsl:when test="descendant::h1">
          <xsl:apply-templates/>
        </xsl:when>
        <xsl:otherwise>
          <section level="1">
            <item>
              <block ccm="yes" onbup="no" quickref="no" web="no">
                <xsl:apply-templates/>
              </block>
            </item>
          </section>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:template>

Source: https://habr.com/ru/post/1333580/

