For those who just need a function to use or a more convenient XPath solution
I created a repository that extends the code below and should generate correct XPaths. I'm leaving the code below in place because it is relatively simple and a good starting point for understanding the approach. The repository is on GitHub.
Answer
Here's an implementation based on @Samar's answer. It generates XPath-like paths (still without proper attribute notation), is tail-recursive, processes attributes, and doesn't use a mutable collection:
import scala.annotation.tailrec
import scala.xml.Node

// Pair each node with its own path, built from the parent's path.
def pathifyNodes(nodes: Seq[Node], parPath: String = "/"): Seq[(Node, String)] =
  nodes.map { nn => (nn, parPath + nn.label + "/") }

@tailrec
final def uniqueXpaths(
  nodes: Seq[(Node, String)],
  pathData: List[(String, String)] = Nil
): List[(String, String)] = nodes match {
  case (node, currentPath) +: rest =>
    // Nodes without children contribute their text.
    val newElementData =
      if (node.child.isEmpty) List((currentPath, node.text)) else Nil
    // Each attribute is recorded as currentPath + "@" + key.
    val newAttributeData = node.attributes.asAttrMap.map {
      case (key, value) => (currentPath + "@" + key, value)
    }.toList
    // Queue the children behind the remaining nodes and accumulate the results.
    uniqueXpaths(
      rest ++ node.child.flatMap(ns => pathifyNodes(ns, currentPath)),
      newElementData ::: newAttributeData ::: pathData
    )
  case Seq() => pathData
}
Run it like this:
val x = <div class="content"><a></a><p><q>hello</q></p><r><p>world</p></r><s></s></div>
val xpaOut = uniqueXpaths(pathifyNodes(x))
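The result is a list of (path, value) pairs. Element paths end in a slash, and text content should show up under scala-xml's `#PCDATA` label, because text nodes are walked like any other child. Here is a minimal sketch of tidying such pairs for display. The `sample` list and the `tidyPath` helper are hypothetical, written by hand in the shape `uniqueXpaths` returns, not captured output:

```scala
// Hypothetical sample shaped like the output of uniqueXpaths: (path, value) pairs.
val sample: List[(String, String)] = List(
  ("/div/p/q/#PCDATA/", "hello"),
  ("/div/a/", ""),
  ("/div/@class", "content")
)

// Hypothetical helper: drops a trailing "#PCDATA/" segment, then a trailing slash.
def tidyPath(path: String): String =
  path.stripSuffix("#PCDATA/").stripSuffix("/")

// Tidy every path and sort by path so the listing is stable.
val tidied: List[(String, String)] =
  sample.map { case (path, value) => (tidyPath(path), value) }.sortBy(_._1)

tidied.foreach { case (path, value) => println(s"$path -> '$value'") }
```

This only post-processes strings, so it leaves the traversal itself untouched.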
Suggestions are welcome. I plan to fix attribute handling so that it produces proper XPaths, which depends on which attributes are chosen, and I may also try to handle recursive XPaths in some reasonable way. I suspect this will significantly increase the size of the code, so I wanted to go ahead and post it now.
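As a rough illustration of what proper attribute notation could look like, the chosen attributes of an element can be rendered as predicates on its location step. This is only a sketch under assumptions: `xpathStep` and `xpathOf` are hypothetical names and are not taken from the repository, and attribute values are assumed to contain no single quotes.

```scala
// Hypothetical helper, not part of the answer's code: renders one location
// step, turning the chosen attributes into XPath predicates.
def xpathStep(label: String, attrs: Map[String, String]): String = {
  val predicates = attrs.toSeq
    .sortBy { case (key, _) => key }                  // stable order regardless of Map ordering
    .map { case (key, value) => s"[@$key='$value']" } // assumes value has no single quote
    .mkString
  label + predicates
}

// Joins the steps into an absolute path.
def xpathOf(steps: Seq[(String, Map[String, String])]): String =
  steps.map { case (label, attrs) => xpathStep(label, attrs) }.mkString("/", "/", "")

val path = xpathOf(Seq(
  "div" -> Map("class" -> "content"),
  "p"   -> Map.empty[String, String]
))
// path == "/div[@class='content']/p"
```

Which attributes go into the predicates is the real design decision: using all of them gives very specific paths, while using none falls back to plain element paths.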