The fastest way to find common items between two dictionary lists in Python

Question

I have two lists of dictionaries.

list1 = [{'user_id':23, 'user_name':'John', 'age':30},
         {'user_id':24, 'user_name':'Shaun', 'age':31},
         {'user_id':25, 'user_name':'Johny', 'age':32}]

list2 =[{'user_id':23},
        {'user_id':25}]

Now I want the following output:

list3 = [{'user_id':23, 'user_name':'John', 'age':30},
         {'user_id':25, 'user_name':'Johny','age':32}]

I want the most efficient way, because my list1 can contain millions of entries.

python

curiousguy Jul 10 '17 at 13:21

4 answers

Jean-François Fabre · Answer 1 · 2017-07-10T13:24:45+0000

You need to transform list2 a little to get fast lookups. I would build a set from it:

list1 = [{'user_id':23, 'user_name':'John','age':30},
         {'user_id':24, 'user_name':'Shaun','age':31},
         {'user_id':25, 'user_name':'Johny','age':32}]

list2 =[{'user_id':23},
        {'user_id':25}]

list2_ids = {d['user_id'] for d in list2}

Then build list3 with a filtering list comprehension. Here the test x['user_id'] in list2_ids is very fast because it is a set lookup rather than a linear search:

list3 = [x for x in list1 if x['user_id'] in list2_ids]

print(list3)

result:

[{'user_id': 23, 'user_name': 'John', 'age': 30}, {'user_id': 25, 'user_name': 'Johny', 'age': 32}]
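To see why the set matters, here is a minimal timing sketch. The synthetic data, the sizes, and the helper names with_set and with_list are assumptions made for illustration; they are not part of the question, and the absolute timings will vary by machine.

```python
import timeit

# Synthetic data (sizes chosen arbitrarily for illustration).
list1 = [{'user_id': i, 'user_name': 'user%d' % i, 'age': 20 + i % 50}
         for i in range(5000)]
list2 = [{'user_id': i} for i in range(0, 5000, 10)]  # 500 ids

def with_set(list1, list2):
    # Membership test against a set: O(1) on average per lookup.
    ids = {d['user_id'] for d in list2}
    return [x for x in list1 if x['user_id'] in ids]

def with_list(list1, list2):
    # Membership test against a list: linear scan per lookup.
    ids = [d['user_id'] for d in list2]
    return [x for x in list1 if x['user_id'] in ids]

# Both approaches return the same records; only the speed differs.
assert with_set(list1, list2) == with_list(list1, list2)

print('set :', timeit.timeit(lambda: with_set(list1, list2), number=5))
print('list:', timeit.timeit(lambda: with_list(list1, list2), number=5))
```

The gap between the two timings should widen as list2 grows, since the list version does work proportional to len(list1) * len(list2).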

omri_saadon · Answer 2 · 2017-07-10T13:32:55+0000

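I assume that list1 is unique by user_id, and that the other keys are name and age.

In that case I would store the data in a dict keyed by user_id, because a dict lookup is O(1) on average.

The whole search then takes O(len(list2)):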

dict1 = {23 : {'user_name':'John', 'age':30},
         24 : {'user_name':'Shaun', 'age':31},
         25 : {'user_name':'Johny', 'age':32}}

list2 =[{'user_id':23},
        {'user_id':25}]

res = [dict1.get(user['user_id']) for user in list2 if user['user_id'] in dict1]

print(res)

result:

[{'user_name': 'John', 'age': 30}, {'user_name': 'Johny', 'age': 32}]
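The answer starts from a ready-made dict. If the data arrives as list1 from the question, one possible way to build such an index is sketched below. The name by_id is hypothetical, and the sketch assumes user_id is unique in list1 (with duplicates, later records would overwrite earlier ones).

```python
list1 = [{'user_id': 23, 'user_name': 'John', 'age': 30},
         {'user_id': 24, 'user_name': 'Shaun', 'age': 31},
         {'user_id': 25, 'user_name': 'Johny', 'age': 32}]

list2 = [{'user_id': 23},
         {'user_id': 25}]

# Index list1 by user_id once; this pass is O(len(list1)).
by_id = {d['user_id']: d for d in list1}

# One dict lookup per entry of list2; ids missing from list1 are skipped.
list3 = [by_id[u['user_id']] for u in list2 if u['user_id'] in by_id]

print(list3)
# [{'user_id': 23, 'user_name': 'John', 'age': 30}, {'user_id': 25, 'user_name': 'Johny', 'age': 32}]
```

Because building the index costs one pass over list1, this variant pays off mainly when the same index is reused for several lookups; for a single query, the set-based filter does comparable work.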

galaxyan · Answer 3 · 2017-07-10T13:42:05+0000

You can use pandas to solve this:
1. Convert each list of dicts to a DataFrame.
2. Merge the two DataFrames on "user_id".

import pandas as pd
list1 = [{'user_id':23, 'user_name':'John', 'age':30},
         {'user_id':24, 'user_name':'Shaun', 'age':31},
         {'user_id':25, 'user_name':'Johny', 'age':32}]
list2 = [{'user_id':23},
         {'user_id':25}]
df1 = pd.DataFrame(list1)
df1
   age  user_id user_name
0   30       23      John
1   31       24     Shaun
2   32       25     Johny
df2 = pd.DataFrame(list2)
df2
   user_id
0       23
1       25

pd.merge(df2,df1,on='user_id')
   user_id  age user_name
0       23   30      John
1       25   32     Johny

Djib2011 · Answer 4 · 2017-07-10T13:48:14+0000

As the other answers say, first turn the ids in list2 into a set:

list2_ids = {d['user_id'] for d in list2}

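Then, as an alternative to a list comprehension, you can use filter:

# filter() is lazy in Python 3; list() materializes the result
list3 = list(filter(lambda x: x['user_id'] in list2_ids, list1))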

Although this is not the fastest option, the filter pattern has several implementations for parallel computing, which you might need if you are dealing with a lot of data.

That said, the best solution is probably a set intersection (comparison):

unique_ids = {d['user_id'] for d in list1} & {d['user_id'] for d in list2}
list3 = [x for x in list1 if x['user_id'] in unique_ids]

If you are sure that the lists do not contain duplicates, you can ignore the set.
