Here is an approach using XSLT 2.0.
Assuming that $docs contains the sequence of document nodes that you want to scan, you want to create one line for each element name that appears in the documents. You can use <xsl:for-each-group> for this:
<xsl:for-each-group select="$docs//*" group-by="name()">
  <xsl:sort select="current-grouping-key()" />
  <xsl:variable name="name" as="xs:string" select="current-grouping-key()" />
  <xsl:value-of select="$name" />
  ...
</xsl:for-each-group>
Then you want to gather statistics for this element across the documents. First, find the documents that have an element of this name in them:
<xsl:variable name="docs-with" as="document-node()+"
    select="$docs[//*[name() = $name]]" />
Secondly, you need a sequence of the number of elements of this name in each of the documents:
<xsl:variable name="elem-counts" as="xs:integer+"
    select="$docs-with/count(//*[name() = $name])" />
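From these you can calculate the statistics. The average, minimum and maximum come from the avg(), min() and max() functions. The percentage is the number of documents that contain the element divided by the total number of documents, multiplied by 100 and formatted.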
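To make the arithmetic concrete, here is a minimal standalone sketch that runs the same calculations on made-up numbers. The figures are hypothetical: it assumes an element that occurs 2 and 4 times in the two documents that contain it, out of four documents in total. The variable names docs-with-count and docs-count are introduced only for this illustration.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:output method="text" />

  <xsl:template name="main">
    <!-- Hypothetical figures, not taken from real documents -->
    <xsl:variable name="elem-counts" as="xs:integer+" select="(2, 4)" />
    <xsl:variable name="docs-with-count" as="xs:integer"
        select="count($elem-counts)" />
    <xsl:variable name="docs-count" as="xs:integer" select="4" />

    <!-- Average: avg((2, 4)) is 3, formatted with one decimal place -->
    <xsl:value-of select="format-number(avg($elem-counts), '#,##0.0')" />
    <xsl:text> </xsl:text>
    <!-- Minimum and maximum occurrences per document -->
    <xsl:value-of select="format-number(min($elem-counts), '#,##0')" />
    <xsl:text> </xsl:text>
    <xsl:value-of select="format-number(max($elem-counts), '#,##0')" />
    <xsl:text> </xsl:text>
    <!-- Percentage: 2 of 4 documents contain the element -->
    <xsl:value-of
        select="format-number(($docs-with-count div $docs-count) * 100, '#0')" />
    <xsl:text>%&#xA;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

Run with main as the initial template, this should print an average of 3.0, a minimum of 2, a maximum of 4 and 50%.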
Putting this together:
<xsl:for-each-group select="$docs//*" group-by="name()">
  <xsl:sort select="current-grouping-key()" />
  <xsl:variable name="name" as="xs:string" select="current-grouping-key()" />
  <!-- Documents that contain at least one element with this name -->
  <xsl:variable name="docs-with" as="document-node()+"
      select="$docs[//*[name() = $name]]" />
  <!-- Number of elements with this name in each of those documents -->
  <xsl:variable name="elem-counts" as="xs:integer+"
      select="$docs-with/count(//*[name() = $name])" />
  <xsl:value-of select="$name" />
  <xsl:text>* </xsl:text>
  <xsl:value-of select="format-number(avg($elem-counts), '#,##0.0')" />
  <xsl:text> </xsl:text>
  <xsl:value-of select="format-number(min($elem-counts), '#,##0')" />
  <xsl:text> </xsl:text>
  <xsl:value-of select="format-number(max($elem-counts), '#,##0')" />
  <xsl:text> </xsl:text>
  <xsl:value-of select="format-number((count($docs-with) div count($docs)) * 100, '#0')" />
  <xsl:text>%</xsl:text>
  <xsl:text>&#xA;</xsl:text>
</xsl:for-each-group>
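Note that this does not indent the lines according to the depth of each element; the elements are simply listed alphabetically by name, each with its statistics. There are two reasons for that: first, it is considerably harder (too involved to write here) to present the statistics in a structure that reflects how the elements appear in the documents, not least because different documents may have different structures. Second, in many markup languages the precise structure of the documents cannot be known in advance (because, for example, sections can nest within sections to any depth).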
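If you do want some indication of where elements occur, one possible variation (not part of the approach above, and only a partial step) is to group by the path of element names instead of the bare name. The sketch below assumes the same $docs variable; the path and depth variables are names made up for this illustration. It reports one line per distinct path, so recursive structures will produce many lines.

```xml
<xsl:for-each-group select="$docs//*"
    group-by="string-join(ancestor-or-self::*/name(), '/')">
  <xsl:sort select="current-grouping-key()" />
  <!-- Path of element names from the document element down to this element -->
  <xsl:variable name="path" as="xs:string" select="current-grouping-key()" />
  <!-- Number of steps in the path, usable for indentation -->
  <xsl:variable name="depth" as="xs:integer"
      select="count(tokenize($path, '/'))" />
  <!-- Documents that contain at least one element at this path -->
  <xsl:variable name="docs-with" as="document-node()+"
      select="current-group()/ancestor::document-node()" />
  <!-- Number of elements at this path in each of those documents -->
  <xsl:variable name="elem-counts" as="xs:integer+"
      select="for $d in $docs-with
              return count(current-group()[root() is $d])" />
  <!-- Indent by two spaces per level below the document element -->
  <xsl:value-of select="for $i in 2 to $depth return '  '" separator="" />
  <xsl:value-of select="$path" />
  <xsl:text> </xsl:text>
  <xsl:value-of select="format-number(avg($elem-counts), '#,##0.0')" />
  <xsl:text>&#xA;</xsl:text>
</xsl:for-each-group>
```

Sorting by the path string keeps a parent path ahead of its children in most cases, but the exact order depends on the collation, so treat the indentation as a reading aid rather than a faithful tree.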
UPDATE:
Need the XSLT wrapper and some instructions for running the XSLT? OK. First, get hold of Saxon 9B.
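Put all the files you want to analyse in one directory. Saxon lets you access all the files in that directory (or its subdirectories) as a collection, using a special URI syntax. It is worth looking at that syntax if you want to search recursively or filter the files by name.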
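As an illustration of Saxon's collection URI syntax, the parameter declaration below adds recurse=yes to the select filter so that subdirectories are searched as well. This is a sketch based on Saxon's documented collection query parameters; the path is a placeholder, and it is worth checking the parameter names against the documentation for your Saxon version.

```xml
<!-- Same dir parameter as in the stylesheet, with recurse=yes added so that
     *.xml files in subdirectories are included too. Placeholder path. -->
<xsl:param name="dir" as="xs:string"
    select="'file:///path/to/default/directory?select=*.xml;recurse=yes'" />
```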
Here is the complete XSLT:
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <!-- Collection URI for the documents to scan; override it on the command line -->
  <xsl:param name="dir" as="xs:string"
      select="'file:///path/to/default/directory?select=*.xml'" />

  <xsl:output method="text" />

  <xsl:variable name="docs" as="document-node()*"
      select="collection($dir)" />

  <!-- Entry point: writes one line of statistics per element name -->
  <xsl:template name="main">
    <xsl:for-each-group select="$docs//*" group-by="name()">
      <xsl:sort select="current-grouping-key()" />
      <xsl:variable name="name" as="xs:string" select="current-grouping-key()" />
      <!-- Documents that contain at least one element with this name -->
      <xsl:variable name="docs-with" as="document-node()+"
          select="$docs[//*[name() = $name]]" />
      <!-- Number of elements with this name in each of those documents -->
      <xsl:variable name="elem-counts" as="xs:integer+"
          select="$docs-with/count(//*[name() = $name])" />
      <xsl:value-of select="$name" />
      <xsl:text>* </xsl:text>
      <xsl:value-of select="format-number(avg($elem-counts), '#,##0.0')" />
      <xsl:text> </xsl:text>
      <xsl:value-of select="format-number(min($elem-counts), '#,##0')" />
      <xsl:text> </xsl:text>
      <xsl:value-of select="format-number(max($elem-counts), '#,##0')" />
      <xsl:text> </xsl:text>
      <xsl:value-of select="format-number((count($docs-with) div count($docs)) * 100, '#0')" />
      <xsl:text>%</xsl:text>
      <xsl:text>&#xA;</xsl:text>
    </xsl:for-each-group>
  </xsl:template>

</xsl:stylesheet>
To run it, use something like:
> java -jar path/to/saxon.jar -it:main -o:report.txt dir=file:///path/to/your/directory?select=*.xml
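This tells Saxon to start processing with the template named main, to set the dir parameter to file:///path/to/your/directory?select=*.xml, and to send the output to report.txt.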