Find Duplicates in an Array of Ruby Hashes

I have an array of hashes, and I need to find and save the hashes that match another hash on one value.

a = [{:id => 1, :name => "Jim", :email => "jim@jim.jim"},
     {:id => 2, :name => "Paul", :email => "paul@paul.paul"},
     {:id => 3, :name => "Tom", :email => "tom@tom.tom"},
     {:id => 1, :name => "Jim", :email => "jim@jim.jim"},
     {:id => 5, :name => "Tom", :email => "tom@tom.tom"},
     {:id => 6, :name => "Jim", :email => "jim@jim.jim"}]

So I would like to return:

b = [{:id => 1, :name => "Jim", :email => "jim@jim.jim"},
     {:id => 3, :name => "Tom", :email => "tom@tom.tom"},
     {:id => 5, :name => "Tom", :email => "tom@tom.tom"},
     {:id => 6, :name => "Jim", :email => "jim@jim.jim"}]

Notes: I can sort the data (CSV) by :name after the fact, so the results do not need to be neatly grouped, just accurate. Also, there are not necessarily exactly two identical entries; there can be 3, 10, or more.

In addition, the data is about 22,000 rows.

1 answer

I tested this and it will do exactly what you want:

b = a.group_by { |h| h[:name] }.values.select { |group| group.size > 1 }.flatten

You may also want to look at the intermediate objects created in this calculation to see whether they are useful to you.
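To make those intermediate objects concrete, here is a minimal sketch that splits the one-liner into steps. The sample array is a shortened, illustrative version of the question's data, and the variable names (by_name, dupes_by_name, counts) are hypothetical, chosen only for this example.

```ruby
# Illustrative sample in the same shape as the question's data.
a = [
  { :id => 1, :name => "Jim",  :email => "jim@jim.jim" },
  { :id => 2, :name => "Paul", :email => "paul@paul.paul" },
  { :id => 3, :name => "Tom",  :email => "tom@tom.tom" },
  { :id => 5, :name => "Tom",  :email => "tom@tom.tom" },
  { :id => 6, :name => "Jim",  :email => "jim@jim.jim" }
]

# Step 1: a Hash mapping each name to the array of hashes that share it.
by_name = a.group_by { |h| h[:name] }

# Step 2: keep only the names that occur more than once (still a Hash).
dupes_by_name = by_name.select { |_name, group| group.size > 1 }

# Step 3: flatten one level to get the flat list of duplicate rows.
b = dupes_by_name.values.flatten(1)

# The intermediate Hash also yields a per-name duplicate count.
counts = dupes_by_name.map { |name, group| [name, group.size] }.to_h
```

Because group_by preserves insertion order, b comes out grouped by name in order of first appearance, so the later sort by :name is only needed if alphabetical order matters. Each step is a single pass over the data, which should be unproblematic for roughly 22,000 rows.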


Source: https://habr.com/ru/post/951684/
