How to remove duplicate XML nodes using Ruby?

Suppose I have this structure:

<one>
   <two>
     <three>3</three>
   </two>

   <two>
     <three>4</three>
   </two>

   <two>
     <three>3</three>
   </two>
</one>

Is there any way to get this:

<one>
  <two>
    <three>3</three>
  </two>

  <two>
    <three>4</three>
  </two>

</one>

using Ruby libraries? I managed to do this with Nokogiri. It works in my tests, but maybe there is a different, better approach.

2 answers

How about doing all of this in two lines?

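seen = Hash.new(0)
node.xpath('.//*').each {|n| n.unlink if (seen[n.to_xml] += 1) > 1}

This walks the elements in document order rather than with traverse, because traverse yields children before their parent: the text and child elements of a repeated node would be unlinked first, and the parent would then no longer match the copy seen earlier.

If the same node can appear under two different parents and you do not want those to be counted as duplicates, you can change the second line to:

node.xpath('.//*').each {|n| n.unlink if (seen[(n.parent.path rescue "") + n.to_xml] += 1) > 1}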

Source: https://habr.com/ru/post/1719903/

