Scala: What is the easiest way to get all leaf nodes and their paths in XML?

I am currently implementing a DFS bypass for xml in such a way that it falls into each node sheet and generates the path to the node sheet.

Given XML:

<vehicles> <vehicle> gg </vehicle> <variable> </variable> </vehicles> 

Output (somthing like):

 Map("gg" -> "vehicles/vehicle", "" -> "vehicles/variable") 

It would be great if a library was available that does this, so I don't need to maintain code.

Thank you Any help is appreciated.

+2
source share
2 answers

Here is a solution using the standard scala xml library that displays a path map -> "node text"

 import scala.xml._ val x = <div class="content"><a></a><p><q>hello</q></p><r><p>world</p></r><s></s></div> var map = Map[String,String]() def dfs(n: Seq[Node], brc: String): Unit = n.foreach(x => { if(x.child.isEmpty){ if(x.text == ""){ map = map + (brc + x.label -> "") dfs(x.child,brc) } else{ map = map + (brc + x.label + " " -> x.text) dfs(x.child,brc) } } else{ val bc = brc + x.label + ">" dfs(x.child,bc) } } ) dfs(x,"") print(map) 
0
source

For those who just need a function to use or a more convenient XPath solution

I created a repository that extends the code below and should generate the correct XPaths, but I leave the code below because it is relatively simple and is a good place to start understanding the code. The repository is on github .

Answer

Here's an implementation based on @Samar's answer, which generates XPaths (still without proper attribute notation), is tail recursive, processes attributes, and doesn't use a mutable collection:

  /** * Helper function to add XPaths to a node sequence; assume a default of root nodes. */ def pathifyNodes(nodes: Seq[Node], parPath: String = "/"): Seq[(Node, String)] = nodes.map{nn => (nn, parPath + nn.label + "/")} @tailrec final def uniqueXpaths( nodes: Seq[(Node, String)], pathData: List[(String, String)] = Nil ): List[(String, String)] = nodes match { case (node, currentPath) +: rest => val newElementData = if(node.child.isEmpty) List((currentPath, node.text)) else Nil val newAttributeData = node.attributes.asAttrMap.map{ case (key, value) => (currentPath + "@" + key, value) }.toList uniqueXpaths( rest ++ node.child.flatMap(ns => pathifyNodes(ns, currentPath)), newElementData ::: newAttributeData ::: pathData ) case Seq() => pathData } 

Follow these steps:

  val x = <div class="content"><a></a><p><q>hello</q></p><r><p>world</p></r><s></s></div> val xpaOut = uniqueXpaths(pathifyNodes(x)) 

Suggestions are welcome. I plan to fix attribute handling to create the right XPaths, which depend on the choice of attributes, and may also try to handle recursive XPaths in some reasonable way, but I suspect this will significantly increase the size of the code, so I would like to continue and paste it now.

+2
source

Source: https://habr.com/ru/post/894987/


All Articles