Remove duplicate entries from nested dictionary if two values match, in Python

Question

Consider a dictionary in this format:

{1:{'name':'chrome', 'author':'google', 'url':'http://www.google.com/' },
 2:{'name':'firefox','author':'mozilla','url':'http://www.mozilla.com/'}}

I want to remove all entries that have the same name and author.

I can easily remove duplicates based on a key by adding every key to a set, and could probably extend that to work with a single value, but that seems like a costly operation that iterates over the dictionary several times. I don't know how to do this efficiently for two values. The dictionary has thousands of items.

+3

python dictionary python-2.5

user479870 Nov 05 '10 at 10:11

2 answers

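Sort the items by (name, author), group adjacent items that share that pair, and keep the first entry of each group: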

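from itertools import groupby

def entry_key(entry):
    # entry is a (key, value) pair taken from d.items()
    key, value = entry
    return (value['name'], value['author'])

def nub(d):
    # groupby only merges adjacent items, so sort by the same key first.
    # sorted() copies the items, so this also works where d.items() is a view.
    items = sorted(d.items(), key=entry_key)
    grouped = groupby(items, entry_key)
    # Keep the first (key, value) pair of each (name, author) group.
    return dict([list(grouper)[0] for (key, grouper) in grouped])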
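A small usage sketch of the sort-and-group approach above, written so that it also runs on Python 3. The third entry, its URL, and the names `name_author` and `dedupe_by_sort` are made up for illustration and are not part of the question's data.

```python
from itertools import groupby

def name_author(item):
    # item is a (key, entry) pair as produced by dict.items()
    _, entry = item
    return (entry['name'], entry['author'])

def dedupe_by_sort(d):
    # groupby only merges adjacent items, so sort by the same key first.
    items = sorted(d.items(), key=name_author)
    # Keep the first (key, entry) pair of every (name, author) group.
    return dict(next(group) for _, group in groupby(items, key=name_author))

browsers = {
    1: {'name': 'chrome', 'author': 'google', 'url': 'http://www.google.com/'},
    2: {'name': 'firefox', 'author': 'mozilla', 'url': 'http://www.mozilla.com/'},
    # Hypothetical duplicate: same name and author, different URL.
    3: {'name': 'chrome', 'author': 'google', 'url': 'http://example.com/chrome'},
}

result = dedupe_by_sort(browsers)
print(len(result))  # 2
```

Because of the sort, this costs O(n log n) rather than a single pass, and it returns a new dictionary instead of modifying the original. Which of several duplicates survives depends on the order in which the dictionary yields its items, since the sort is stable.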

+1

sykora Nov 05 '10 at 10:27

Pär Wieslander · Accepted Answer · 2010-11-05T10:20:01+0000

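Go through the dictionary, keep a set of the (name, author) tuples encountered so far, and delete every entry whose tuple is already in the set: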

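def remove_duplicates(d):
    # (name, author) pairs seen so far
    encountered_entries = set()
    # Iterate over a copy of the items: deleting from d while iterating
    # over d.items() directly raises RuntimeError on Python 3.
    for key, entry in list(d.items()):
        if (entry['name'], entry['author']) in encountered_entries:
            del d[key]
        else:
            encountered_entries.add((entry['name'], entry['author']))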
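A variant of the same idea that builds a new dictionary instead of deleting in place, which sidesteps any question of mutating a dict while looping over it. This is only a sketch: the name `without_duplicates` and the third sample entry are assumptions made for illustration.

```python
def without_duplicates(d):
    seen = set()   # (name, author) pairs already kept
    result = {}
    for key, entry in d.items():
        signature = (entry['name'], entry['author'])
        if signature not in seen:
            seen.add(signature)
            result[key] = entry
    return result

browsers = {
    1: {'name': 'chrome', 'author': 'google', 'url': 'http://www.google.com/'},
    2: {'name': 'firefox', 'author': 'mozilla', 'url': 'http://www.mozilla.com/'},
    # Hypothetical duplicate: same name and author, different URL.
    3: {'name': 'chrome', 'author': 'google', 'url': 'http://example.com/chrome'},
}

unique = without_duplicates(browsers)
print(len(unique))    # 2
print(len(browsers))  # 3, the input is left untouched
```

Set membership tests are constant time on average, so the whole pass is linear in the number of entries, which should be comfortable for a dictionary with thousands of items.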
