Remove duplicate entries from nested dictionary, if two values are the same, in Python

Developer (https://www.devze.com), 2023-01-23 21:06. Source: the web

Consider this dictionary format.

{1:{'name':'chrome', 'author':'google', 'url':'http://www.google.com/' },
 2:{'name':'firefox','author':'mozilla','url':'http://www.mozilla.com/'}}

I want to remove all items which have the same name and author.

I can easily remove duplicate entries based on keys by putting all keys in a set, and maybe expand this to work on a specific value, but this seems like a costly operation which iterates over a dictionary multiple times. I wouldn't know how to do this with two values in an efficient way. It's a dictionary with thousands of items.

Iterate through the dictionary, keeping track of encountered (name, author) tuples as you go and remove those that you have already encountered:

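def remove_duplicates(d):
    encountered_entries = set()
    # Iterate over a copy of the items: deleting from a dict while iterating
    # over it directly raises RuntimeError in Python 3.
    for key, entry in list(d.items()):
        if (entry['name'], entry['author']) in encountered_entries:
            del d[key]
        else:
            encountered_entries.add((entry['name'], entry['author']))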
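
The same single-pass idea can be extended to any combination of fields. The following is only a sketch: the function name remove_duplicates_by, its fields parameter, and the duplicate entry under key 3 are assumptions made up for illustration and do not come from the question.

```python
def remove_duplicates_by(d, fields=("name", "author")):
    """Delete, in place, entries whose values for `fields` were already seen."""
    seen = set()
    # list(...) takes a snapshot, so deleting from d inside the loop is safe.
    for key, entry in list(d.items()):
        signature = tuple(entry[f] for f in fields)
        if signature in seen:
            del d[key]
        else:
            seen.add(signature)
    return d


browsers = {
    1: {"name": "chrome", "author": "google", "url": "http://www.google.com/"},
    2: {"name": "firefox", "author": "mozilla", "url": "http://www.mozilla.com/"},
    # Hypothetical duplicate of entry 1: same name and author, different url.
    3: {"name": "chrome", "author": "google", "url": "http://www.google.com/chrome"},
}
remove_duplicates_by(browsers)
print(sorted(browsers))  # [1, 2]
```

Each entry is visited once and set lookups take constant time on average, so the cost should grow roughly linearly with the number of items.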

Let's see if this works...

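from itertools import groupby

def entry_key(entry):
    # entry is a (key, value) pair taken from d.items()
    key, value = entry
    return (value['name'], value['author'])

def nub(d):
    # groupby only merges adjacent items, so sort by (name, author) first.
    # sorted() is used because Python 3's items() view has no .sort() method.
    items = sorted(d.items(), key=entry_key)
    grouped = groupby(items, entry_key)
    # Keep the first entry of each group; next(it) replaces Python 2's it.next().
    return dict(next(grouper) for (key, grouper) in grouped)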
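
Like the groupby version, the sketch below returns a new dictionary and leaves the input alone, but it skips the sort and makes a single pass. The name nub_one_pass and the sample entry under key 3 are hypothetical and used only for illustration.

```python
def nub_one_pass(d):
    """Return a new dict holding the first entry seen for each (name, author)."""
    seen = set()
    result = {}
    for key, entry in d.items():
        signature = (entry["name"], entry["author"])
        if signature not in seen:
            seen.add(signature)
            result[key] = entry
    return result


browsers = {
    1: {"name": "chrome", "author": "google", "url": "http://www.google.com/"},
    2: {"name": "firefox", "author": "mozilla", "url": "http://www.mozilla.com/"},
    # Hypothetical duplicate of entry 1: same name and author.
    3: {"name": "chrome", "author": "google", "url": "http://www.google.com/chrome"},
}
unique = nub_one_pass(browsers)
print(sorted(unique))    # [1, 2]
print(len(browsers))     # 3, the input is not modified
```

Sorting costs O(n log n), while this loop stays at roughly O(n), which may matter for a dictionary with thousands of items.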