I have a massive Python dictionary with over 90,000 entries. For reasons I won't get into, I need to store this dictionary in my database and then, at a later point, recompile the dictionary from the database entries.
I am trying to set up a procedure to verify that my storage and recompilation was faithful and that my new dictionary is equivalent to the old one. What is the best methodology for testing this?
There are minor differences and I want to figure out what they are.
The most obvious approach is of course:
if oldDict != newDict:
    print("**Failure to rebuild, new dictionary is different from the old")
That ought to be the fastest possible, since it relies on Python's internals to do the comparison.
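As a quick illustration of what that comparison checks, here is a small sketch in Python 3 syntax (the dictionaries are made up for the example). `==` on dictionaries compares keys and values, including nested containers, and does not depend on insertion order.

```python
# Hypothetical data standing in for the original and the rebuilt dictionary.
old_dict = {"a": 1, "b": [1, 2, 3], "c": {"x": 1.5}}
new_dict = {"c": {"x": 1.5}, "b": [1, 2, 3], "a": 1}  # same content, different insertion order

same = old_dict == new_dict      # True: insertion order is ignored

# A change buried inside a nested value is still detected.
new_dict["c"]["x"] = 2.5
changed = old_dict != new_dict   # True

print(same, changed)
```

Note that this only tells you whether the two dictionaries differ, not where.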
UPDATE: It seems you're not after "equal", but something weaker. I think you need to edit your question to make it clear what you consider "equivalent" to mean.
You could start with something like this and tweak it to suit your needs:
>>> import random
>>> bigd = dict([(x, random.randint(0, 1024)) for x in xrange(90000)])
>>> bigd2 = dict([(x, random.randint(0, 1024)) for x in xrange(90000)])
>>> dif = set(bigd.items()) - set(bigd2.items())  # items in bigd not in bigd2
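If "equivalent" means something weaker than `==`, one possible tweak is to normalise both sides before comparing. The sketch below is only an illustration, in Python 3 syntax. It assumes, hypothetically, that the database round trip turned keys and values into strings. `normalize` and `equivalent` are names invented here, and the rule should be adapted to whatever differences are acceptable in your case.

```python
def normalize(d):
    """Return a copy of d with keys and values converted to stripped strings.

    This rule is an assumption made for the example, not something
    stated in the question.
    """
    return {str(k).strip(): str(v).strip() for k, v in d.items()}


def equivalent(old, new):
    """True if the two dicts are equal after normalisation."""
    return normalize(old) == normalize(new)


old_dict = {1: 10, 2: 20}
new_dict = {"1": "10", "2": "20 "}  # as it might come back from a text column

print(old_dict == new_dict)            # False: the types differ
print(equivalent(old_dict, new_dict))  # True under the assumed rule
```

Be aware that a rule like this can merge keys that were distinct in the original (for example `1` and `"1"`), so it is worth checking that the normalised dictionaries have the same length as the originals.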
>>> d1 = {'a':1,'b':2,'c':3}
>>> d2 = {'b':2,'x':2,'a':5}
>>> set(d1.iteritems()) - set(d2.iteritems()) # items in d1 not in d2
set([('a', 1), ('c', 3)])
>>> set(d2.iteritems()) - set(d1.iteritems()) # items in d2 not in d1
set([('x', 2), ('a', 5)])
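Building on the set differences above, a per-key report may be easier to read, since it separates missing keys, extra keys and changed values. Here is a minimal sketch in Python 3 syntax (`dict_diff` is a name invented for this example). Because it compares values with `!=` instead of putting them into a set, it should also work when the values are unhashable, such as lists.

```python
def dict_diff(old, new):
    """Return (removed, added, changed) describing how new differs from old."""
    old_keys = set(old)
    new_keys = set(new)
    removed = {k: old[k] for k in old_keys - new_keys}  # keys only in old
    added = {k: new[k] for k in new_keys - old_keys}    # keys only in new
    changed = {k: (old[k], new[k])                      # keys in both, values differ
               for k in old_keys & new_keys
               if old[k] != new[k]}
    return removed, added, changed


d1 = {'a': 1, 'b': 2, 'c': 3}
d2 = {'b': 2, 'x': 2, 'a': 5}
removed, added, changed = dict_diff(d1, d2)
print(removed)  # {'c': 3}
print(added)    # {'x': 2}
print(changed)  # {'a': (1, 5)}
```

If all three results are empty, the rebuilt dictionary matches the original.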
Edit: Don't vote for this answer. Go to "Fast comparison between two Python dictionary" and add an upvote there. It is a very complete solution.