I have XML documents (similar to DocBook) that need to be converted to XSL-FO. Some of the documents contain poetry, and the lines of each verse are written in separate p tags. Verses are separated by br tags. There are also page tags, which are irrelevant and should be ignored.
Example of typical source:
<h4>Headline</h4>
<p>1st line of 1st verse</p>
<p>2nd line of 1st verse</p>
<br/>
<p>1st line of 2nd verse</p>
<p>2nd line of 2nd verse</p>
<page n="100"/>
<p>3rd line of 2nd verse</p>
<h4>Other headline</h4>
For the XSL-FO output, I would like to collect the entire text of a verse into one fo:block. Currently, the mechanism works for structures like the one shown above, but there are some exceptions. The current approach is to decide for each p tag:
- Am I the first line of a verse?
- If yes: collect the entire text of this verse and write it into an fo:block, using the attributes of the current (first) p tag to set the formatting of the block.
- If not: the content has already been collected, so do nothing.
A first line is a p tag immediately preceded by an h4 or br tag (or by a page tag that is itself immediately preceded by a br tag). This was easy to implement.
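To make the first-line decision concrete, here is a minimal sketch of how such a test could look. It is not my actual stylesheet: the variable name prev is hypothetical, and the sketch simplifies the rule by skipping every page tag when looking backwards, which is slightly more permissive than the rule described above.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fo="http://www.w3.org/1999/XSL/Format">

  <xsl:template match="p">
    <!-- Nearest preceding sibling element, skipping page tags
         (assumption: page tags never count as separators) -->
    <xsl:variable name="prev"
        select="preceding-sibling::*[not(self::page)][1]"/>
    <xsl:choose>
      <!-- First line of a verse: the nearest non-page sibling is h4 or br -->
      <xsl:when test="$prev[self::h4 or self::br]">
        <fo:block>
          <!-- Placeholder: the text of the whole verse would be collected here -->
          <xsl:apply-templates/>
        </fo:block>
      </xsl:when>
      <!-- Any other line belongs to a verse that its first line handles: no output -->
      <xsl:otherwise/>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

Because preceding-sibling is a reverse axis, the [1] picks the closest preceding element that is not a page tag.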
Gathering the text of the verse was simple for this example: group all following siblings, with each group ending at an h4 or br tag, then take the first group and use all of its p tags (ignoring any page tags in between and the h4 or br tag at the end).
In code:
<xsl:for-each-group select="following-sibling::*" group-ending-with="br|h4">
  <xsl:if test="position()=1">
    <xsl:for-each select="current-group()[not(self::h4) and not(self::br) and not(self::page)]">
      <xsl:apply-templates/>&crt;
    </xsl:for-each>
  </xsl:if>
</xsl:for-each-group>
Now to a similar source example:
<h4>Headline</h4>
<p class="center">1</p>
<p>1st line of 1st verse</p>
<p>2nd line of 1st verse</p>
<br/>
<p class="center">2</p>
<p>1st line of 2nd verse</p>
<p>2nd line of 2nd verse</p>
<page n="100"/>
<p>3rd line of 2nd verse</p>
<h4>Other headline</h4>
Here the centered p acts as a kind of subtitle for the verse that follows. Strictly speaking it is not a verse, but for my purposes it would be enough if it were kept separate from the actual text of the verse. Hence a slightly modified rule for obtaining the entire text of the current verse: group all following siblings, with each group ending at an h4 or br tag or at a p tag whose class differs from that of the current p tag, then take the first group and use all of its p tags (ignoring any page tags in between and the h4 or br tag).
Therefore, I saved the class attribute value of the current p tag in a variable called attributes, and defined the group rule as:
<xsl:for-each-group select="following-sibling::*" group-ending-with="br|h4|p[normalize-space(@class) != $attributes]">
In turn, when determining whether a p tag is the first line of a verse, it can now be preceded not only by h4 or br, but also by another p tag that has a different class attribute value.
This works fine in my test environment in Oxygen using Saxon-B 9.1.0.6. But the conversion must be done in Java using Saxon9.jar, and there, using the variable inside the group-ending-with attribute of xsl:for-each-group raises an exception.
And now I'm kinda stuck.
Is there a better way to define the grouping conditions? Or can this perhaps not be done with grouping at all, only with a completely different approach?
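To illustrate what I mean by a different approach, here is an untested sketch that groups from the parent element with group-adjacent instead of grouping the following siblings of each p. Everything in it is an assumption rather than part of my stylesheet: the match pattern *[p] stands for whatever element directly contains the h4, p, br and page children, the mapping from class "center" to text-align is only an example, and the line feed stands in for my crt entity. It does not reference a variable inside a pattern, so it might sidestep the exception, but I have not verified that in the Java setup.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fo="http://www.w3.org/1999/XSL/Format">

  <!-- Assumption: matches the element that directly contains the h4/p/br/page children -->
  <xsl:template match="*[p]">
    <!-- page tags are left out of the population, so they cannot split a verse -->
    <xsl:for-each-group select="*[not(self::page)]"
        group-adjacent="if (self::p)
                        then concat('p:', normalize-space(@class))
                        else 'other'">
      <xsl:choose>
        <!-- The context item is the first item of the current group -->
        <xsl:when test="self::p">
          <fo:block>
            <!-- Hypothetical mapping from the first p's class to a block property -->
            <xsl:if test="normalize-space(@class) = 'center'">
              <xsl:attribute name="text-align">center</xsl:attribute>
            </xsl:if>
            <xsl:for-each select="current-group()">
              <xsl:apply-templates/>
              <!-- Line separator between lines, stand-in for the crt entity -->
              <xsl:if test="position() != last()">
                <xsl:text>&#10;</xsl:text>
              </xsl:if>
            </xsl:for-each>
          </fo:block>
        </xsl:when>
        <!-- h4 and br are passed on to their own templates -->
        <xsl:otherwise>
          <xsl:apply-templates select="current-group()"/>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:for-each-group>
  </xsl:template>

</xsl:stylesheet>
```

With this key, adjacent p tags that share the same class fall into one group, while an h4, a br, or a p with a different class starts a new group. In the second example above, each centered p would end up in its own fo:block and the verse lines after it in the next one.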
The source files are what they are; the tagging may not be optimal, but it cannot be changed. The transformation is not new; it was subsequently adapted to our needs. Source files with verses in them were simply avoided until now, but I would like to find a solution for this.
Any help would be greatly appreciated.
Yours faithfully,
Christian Kirchhoff