I had a strange bug when porting a feature to the Python 3.1 fork of my program. I narrowed it down to the following hypothesis:
In contrast to Python 2.x, in Python 3.x if an object has an __eq__ method it is automatically unhashable.
Is this true?
Here's what happens in Python 3.1:
>>> class O(object):
...     def __eq__(self, other):
...         return 'whatever'
...
>>> o = O()
>>> d = {o: 0}
Traceback (most recent call last):
  File "<pyshell#16>", line 1, in <module>
    d = {o: 0}
TypeError: unhashable type: 'O'
The follow-up question is: how do I solve my actual problem? I have a ChangeTracker object which stores a WeakKeyDictionary that points to several objects, giving for each the value of its pickle dump at a certain point in the past. Whenever an existing object is checked in, the change tracker says whether its new pickle is identical to its old one, and therefore whether the object has changed in the meantime. The problem is that now I can't even check whether a given object is in the library, because doing so raises an exception about the object being unhashable (because it has an __eq__ method). How can I work around this?
Yes, if you define __eq__, the default __hash__ (namely, hashing the address of the object in memory) goes away. This is important because hashing needs to be consistent with equality: equal objects need to hash the same.
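A quick way to observe this in Python 3 is to inspect the class after it is created. This is only a sketch; the class names `Plain` and `WithEq` are made up for illustration and are not from the question.

```python
from collections.abc import Hashable


class Plain:
    """No __eq__ override, so the default identity-based __hash__ is kept."""


class WithEq:
    """Overrides __eq__ only; Python 3 then sets __hash__ to None."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        if not isinstance(other, WithEq):
            return NotImplemented
        return self.value == other.value


print(Plain.__hash__ is object.__hash__)   # True: default hash inherited
print(WithEq.__hash__ is None)             # True: hashing was switched off
print(isinstance(WithEq(1), Hashable))     # False

try:
    {WithEq(1): 0}
except TypeError as exc:
    # The exact wording of the message varies between Python versions.
    error_message = str(exc)
    print(error_message)
```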
The solution is simple: just define __hash__ along with defining __eq__.
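As a minimal sketch of that advice, assuming a hypothetical value class `Point` whose equality depends on two fields, the hash can be computed from exactly the fields that __eq__ compares:

```python
class Point:
    """Hypothetical value class: equality and hash use the same fields."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        # Hash exactly the fields that __eq__ compares, so equal
        # objects always produce equal hashes.
        return hash((self.x, self.y))


a = Point(1, 2)
b = Point(1, 2)
d = {a: "first"}

print(a == b, hash(a) == hash(b))  # True True
print(d[b])                        # first: b finds the entry stored under a
```

This only stays correct while x and y are not modified after the object has been used as a key; a key whose hash changes can no longer be found in the dictionary.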
This paragraph is from http://docs.python.org/3.1/reference/datamodel.html#object.hash:
If a class that overrides __eq__() needs to retain the implementation of __hash__() from a parent class, the interpreter must be told this explicitly by setting __hash__ = <ParentClass>.__hash__. Otherwise the inheritance of __hash__() will be blocked, just as if __hash__ had been explicitly set to None.
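To illustrate the quoted rule, here is a small sketch with a hypothetical parent class `Base` and two subclasses, `Blocked` and `Kept` (all three names are invented for this example). Both subclasses override __eq__, but only one re-declares the parent's hash:

```python
class Base:
    """Hypothetical parent class with its own name-based __hash__."""

    def __init__(self, name):
        self.name = name

    def __hash__(self):
        return hash(self.name)


class Blocked(Base):
    # Overrides __eq__ only, so inheritance of __hash__ is blocked.
    def __eq__(self, other):
        if not isinstance(other, Blocked):
            return NotImplemented
        return self.name == other.name


class Kept(Base):
    def __eq__(self, other):
        if not isinstance(other, Kept):
            return NotImplemented
        return self.name == other.name

    # Explicitly tell the interpreter to keep the parent's implementation.
    __hash__ = Base.__hash__


print(Blocked.__hash__ is None)       # True
print(hash(Kept("a")) == hash("a"))   # True: Base.__hash__ is in use
```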
Check the Python 3 manual on object.__hash__:
If a class does not define an __eq__() method it should not define a __hash__() operation either; if it defines __eq__() but not __hash__(), its instances will not be usable as items in hashable collections.
Emphasis is mine.
If you want to be lazy, it sounds like you can just define __hash__(self) to return id(self):
User-defined classes have __eq__() and __hash__() methods by default; with them, all objects compare unequal (except with themselves) and x.__hash__() returns id(x).
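For the change tracker described in the question, the lazy approach might look roughly like the sketch below. Everything here is an assumption made for illustration: the `Tracked` class, the `check_in` method name, and the choice to pickle `vars(obj)` rather than the object itself (done only to keep the sketch self-contained). Note that an identity-based hash combined with a value-based __eq__ means equal objects no longer hash the same, so this is only reasonable when each object should be tracked as its own key.

```python
import pickle
import weakref


class Tracked:
    """Hypothetical tracked object: value-based __eq__, identity-based hash."""

    def __init__(self, payload):
        self.payload = payload

    def __eq__(self, other):
        if not isinstance(other, Tracked):
            return NotImplemented
        return self.payload == other.payload

    def __hash__(self):
        # Lazy option: restore the identity-style hash lost by defining __eq__.
        return id(self)


class ChangeTracker:
    """Sketch of a tracker keyed weakly on the objects it watches."""

    def __init__(self):
        self._snapshots = weakref.WeakKeyDictionary()

    def check_in(self, obj):
        """Store a fresh snapshot; return True if it differs from the last one.

        The first check-in of an object also returns True, because there is
        no earlier snapshot to compare against.
        """
        new_dump = pickle.dumps(vars(obj))
        changed = self._snapshots.get(obj) != new_dump
        self._snapshots[obj] = new_dump
        return changed


tracker = ChangeTracker()
a = Tracked([1, 2])

print(tracker.check_in(a))  # True: first time seen
print(tracker.check_in(a))  # False: nothing changed
a.payload.append(3)
print(tracker.check_in(a))  # True: payload was modified
```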
I'm no Python expert, but wouldn't it make sense that when you define an eq method, you also have to define a hash method (which calculates the hash value for an object)? Otherwise, the hashing mechanism wouldn't know whether it hit the same object or a different object with the same hash value. Actually, it's the other way around: it would probably end up computing different hash values for objects considered equal by your __eq__ method.
I have no idea what that hash function is called though, __hash__ perhaps? :)