XQuery/XPath: using the count() and max() functions to return the item with the highest count

I have an XML file containing authors and editors.

    <?xml version="1.0" encoding="UTF-8"?>
    <?oxygen RNGSchema="file:textbook.rnc" type="compact"?>
    <books xmlns="books">
      <book ISBN="i0321165810" publishername="OReilly">
        <title>XPath</title>
        <author>
          <name>
            <fname>Priscilla</fname>
            <lname>Walmsley</lname>
          </name>
        </author>
        <year>2007</year>
        <field>Databases</field>
      </book>
      <book ISBN="i0321165812" publishername="OReilly">
        <title>XQuery</title>
        <author>
          <name>
            <fname>Priscilla</fname>
            <lname>Walmsley</lname>
          </name>
        </author>
        <editor>
          <name>
            <fname>Lisa</fname>
            <lname>Williams</lname>
          </name>
        </editor>
        <year>2003</year>
        <field>Databases</field>
      </book>
      <publisher publishername="OReilly">
        <web-site>www.oreilly.com</web-site>
        <address>
          <street_address>hill park</street_address>
          <zip>90210</zip>
          <state>california</state>
        </address>
        <phone>400400400</phone>
        <e-mail> oreilly@oreilly.com </e-mail>
        <contact>
          <field>Databases</field>
          <name>
            <fname>Anna</fname>
            <lname>Smith</lname>
          </name>
        </contact>
      </publisher>
    </books>

I’m looking for a way to return the person who is listed most often as author and/or editor. The solution must be compatible with XQuery 1.0 (XPath 2.0).

I thought about using a FLWOR query to iterate over all authors and editors, count the occurrences of each distinct author/editor, and then return the author(s)/editor(s) whose count matches the highest count. But I could not find a working solution.

Does anyone have a suggestion on how such a FLWOR query could be written? Could this be done more simply in plain XPath?

Respectfully,

Ginette

+6
4 answers

This may help:

    declare default element namespace 'books';
    (: $doc is assumed to be bound to the document node of the XML file :)
    (
      for $name in distinct-values($doc/books/*/*/name)
      let $entries := $doc/books/*[data(*/name) = $name]
      order by count($entries) descending
      return $entries/*/name
    )[1]
+15

Here is a pure XPath 2.0 expression, admittedly not for the faint of heart:

    (for $m in max(for $n in distinct-values(
                                /*/b:book/(b:author | b:editor)
                                   /b:name/concat(b:fname, '|', b:lname)),
                       $cnt in count(/*/b:book/(b:author | b:editor)
                                       /b:name[$n eq concat(b:fname, '|', b:lname)])
                     return $cnt
                  ),
         $name in /*/b:book/(b:author | b:editor)/b:name,
         $fullName in $name/concat(b:fname, '|', b:lname),
         $count in count(/*/b:book/(b:author | b:editor)
                           /b:name[$fullName eq concat(b:fname, '|', b:lname)])
      return
        if ($count eq $m) then $name else ()
    )[1]

where the prefix "b:" is bound to the namespace "books".
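
To run the expression as an XQuery 1.0 query rather than from XSLT, the prefix could be bound in the query prolog. This is only a minimal sketch, assuming the context item is the document node of the sample file. The count() call is there just to check that the path matches anything, and for the sample document it should return 3 (two author names plus one editor name).

```xquery
(: Bind the prefix b: to the namespace URI used by the sample document. :)
declare namespace b = "books";

(: Sanity check: how many author/editor name elements does the path select? :)
count(/*/b:book/(b:author | b:editor)/b:name)
```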

XSLT 2.0-based verification:

    <xsl:stylesheet version="2.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:b="books">
      <xsl:output omit-xml-declaration="yes" indent="yes"/>
      <xsl:strip-space elements="*"/>

      <xsl:template match="/">
        <xsl:sequence select=
          "(for $m in max(for $n in distinct-values(
                                      /*/b:book/(b:author | b:editor)
                                         /b:name/concat(b:fname, '|', b:lname)),
                             $cnt in count(/*/b:book/(b:author | b:editor)
                                             /b:name[$n eq concat(b:fname, '|', b:lname)])
                           return $cnt
                        ),
               $name in /*/b:book/(b:author | b:editor)/b:name,
               $fullName in $name/concat(b:fname, '|', b:lname),
               $count in count(/*/b:book/(b:author | b:editor)
                                 /b:name[$fullName eq concat(b:fname, '|', b:lname)])
            return
              if ($count eq $m) then $name else ()
           )[1]
          "/>
      </xsl:template>
    </xsl:stylesheet>

When this transformation is applied to the provided XML document:

    <books xmlns="books">
      <book ISBN="i0321165810" publishername="OReilly">
        <title>XPath</title>
        <author>
          <name>
            <fname>Priscilla</fname>
            <lname>Walmsley</lname>
          </name>
        </author>
        <year>2007</year>
        <field>Databases</field>
      </book>
      <book ISBN="i0321165812" publishername="OReilly">
        <title>XQuery</title>
        <author>
          <name>
            <fname>Priscilla</fname>
            <lname>Walmsley</lname>
          </name>
        </author>
        <editor>
          <name>
            <fname>Lisa</fname>
            <lname>Williams</lname>
          </name>
        </editor>
        <year>2003</year>
        <field>Databases</field>
      </book>
      <publisher publishername="OReilly">
        <web-site>www.oreilly.com</web-site>
        <address>
          <street_address>hill park</street_address>
          <zip>90210</zip>
          <state>california</state>
        </address>
        <phone>400400400</phone>
        <e-mail> oreilly@oreilly.com </e-mail>
        <contact>
          <field>Databases</field>
          <name>
            <fname>Anna</fname>
            <lname>Smith</lname>
          </name>
        </contact>
      </publisher>
    </books>

the wanted, correct name element is selected and output:

    <name xmlns="books">
      <fname>Priscilla</fname>
      <lname>Walmsley</lname>
    </name>
+7

I have always thought this was an omission in XPath: the max() and min() functions return the highest/lowest value, whereas what you usually want is the item in the sequence that has the highest/lowest value for some expression. One solution is to sort the items by that value and take the first/last, which seems inelegant. Computing the min/max and then selecting the items whose value matches it seems equally unattractive. Saxon already has a pair of higher-order extension functions, saxon:highest() and saxon:lowest(), which take a sequence and a function and return the items from the sequence that have the highest or lowest value for the result of the function. The good news is that in XPath 3.0 you can write these functions yourself (they are actually given as examples of user-written functions in the specification).
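
A minimal sketch of what such a user-written function might look like, for illustration only. It needs XQuery 3.0 (inline functions), so it does not meet the XQuery 1.0 requirement in the question. The name local:highest, its argument order, and the external variable $doc are assumptions made for this example, not the definitions from Saxon or the specification.

```xquery
xquery version "3.0";
declare default element namespace "books";

(: Hypothetical helper: returns every item of $seq whose key value is maximal.
   $key is called once per item to find the maximum, then again to filter. :)
declare function local:highest(
  $seq as item()*,
  $key as function(item()) as xs:anyAtomicType
) as item()* {
  let $max := max(for $item in $seq return $key($item))
  return $seq[$key(.) = $max]
};

(: Assumed to be bound by the caller to the document node of the sample file. :)
declare variable $doc external;

let $names := $doc/books/book/(author | editor)/name
return local:highest(
  $names,
  (: key: how many listed names share this first and last name :)
  function($n) { count($names[fname = $n/fname and lname = $n/lname]) }
)
```

Unlike the answers that end in [1], this returns every name element tied for the top count, so for the sample document it should select both Walmsley name elements. Wrapping the call in ( ... )[1] would keep only the first.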

+4

You are on the right track. The simplest way is to convert the names to strings (for example, first and last name separated by a space) and work with those. (Note that the following code is untested.)

    declare default element namespace 'books';
    (: fname and lname are children of name, not of author/editor :)
    let $names := (//editor | //author)/name/concat(fname, ' ', lname)
    let $distinct-names := distinct-values($names)
    let $name-count :=
      for $name in $distinct-names
      return count($names[. = $name])
    for $name at $pos in $distinct-names
    where $name-count[$pos] = max($name-count)
    return $name

Or another approach:

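    declare default element namespace 'books';
    (
      let $people := (//editor | //author)
      for $person in $people
      (: compare via name/fname and name/lname, since both sit inside name :)
      order by count($people[name/fname = $person/name/fname
                             and name/lname = $person/name/lname])
      return $person
    )[last()]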
+2

Source: https://habr.com/ru/post/902802/