Types that define `__eq__` are unhashable?

开发者 https://www.devze.com 2022-12-09 09:06 Source: Internet

I had a strange bug when porting a feature to the Python 3.1 fork of my program. I narrowed it down to the following hypothesis:

In contrast to Python 2.x, in Python 3.x if an object has an __eq__ method it is automatically unhashable.

Is this true?

Here's what happens in Python 3.1:

>>> class O(object):
...     def __eq__(self, other):
...         return 'whatever'
...
>>> o = O()
>>> d = {o: 0}
Traceback (most recent call last):
  File "<pyshell#16>", line 1, in <module>
    d = {o: 0}
TypeError: unhashable type: 'O'

The follow-up question is, how do I solve my personal problem? I have an object ChangeTracker which stores a WeakKeyDictionary that points to several objects, giving for each the value of its pickle dump at a certain point in the past. Whenever an existing object is checked in, the change tracker says whether its new pickle is identical to its old one, and therefore whether the object has changed in the meantime. The problem is that now I can't even check whether a given object is in the library, because the lookup raises an exception about the object being unhashable (because it has an __eq__ method). How can I work around this?


Yes, if you define __eq__, the default __hash__ (namely, hashing the address of the object in memory) goes away. This is important because hashing needs to be consistent with equality: equal objects need to hash the same.
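You can check this for yourself in Python 3. A minimal sketch (the class names Plain and WithEq are made up for illustration):

```python
class Plain:
    """No __eq__ defined, so the default __hash__ is inherited from object."""
    pass


class WithEq:
    """Defines __eq__ only; Python 3 then sets __hash__ to None on the class."""

    def __eq__(self, other):
        return isinstance(other, WithEq)


print(Plain.__hash__ is object.__hash__)  # default hash is still inherited
print(WithEq.__hash__ is None)            # hashing has been switched off

try:
    hash(WithEq())
except TypeError as exc:
    print("TypeError:", exc)
```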

The solution is simple: just define __hash__ along with defining __eq__.
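One common way to do that is to hash the same fields that __eq__ compares. This is only a sketch: Point and its x/y attributes are hypothetical, and it assumes those attributes don't change while the object is used as a dict key or set member.

```python
class Point:
    """Hypothetical value type: equality and hash are both based on (x, y)."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        # Hash exactly the fields that __eq__ compares, so equal
        # objects are guaranteed to hash the same.
        return hash((self.x, self.y))


d = {Point(1, 2): "a"}
print(d[Point(1, 2)])  # a distinct but equal Point finds the same entry
```

If the fields can change after the object has been stored, this kind of hash breaks lookups, so a value-based __hash__ is best kept for objects you treat as immutable.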


This paragraph is from http://docs.python.org/3.1/reference/datamodel.html#object.hash:

If a class that overrides __eq__() needs to retain the implementation of __hash__() from a parent class, the interpreter must be told this explicitly by setting __hash__ = <ParentClass>.__hash__. Otherwise the inheritance of __hash__() will be blocked, just as if __hash__ had been explicitly set to None.
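As a sketch of what that looks like in practice (Base, Child and Broken are hypothetical names; Child only tightens equality, so the parent's hash stays consistent with it):

```python
class Base:
    """Hypothetical parent with a consistent __eq__/__hash__ pair."""

    def __init__(self, key):
        self.key = key

    def __eq__(self, other):
        return isinstance(other, Base) and self.key == other.key

    def __hash__(self):
        return hash(self.key)


class Broken(Base):
    """Overrides __eq__ only, so the inherited __hash__ is blocked."""

    def __eq__(self, other):
        return type(other) is type(self) and self.key == other.key


class Child(Base):
    """Overrides __eq__ and explicitly keeps the parent's __hash__."""

    def __eq__(self, other):
        return type(other) is type(self) and self.key == other.key

    __hash__ = Base.__hash__


print(Broken.__hash__ is None)       # inheritance of __hash__ was blocked
print(Child("a") in {Child("a")})    # Child instances are still hashable
```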


Check the Python 3 manual on object.__hash__:

If a class does not define an __eq__() method it should not define a __hash__() operation either; if it defines __eq__() but not __hash__(), its instances will not be usable as items in hashable collections.

Emphasis is mine.

If you want to be lazy, it sounds like you can just define __hash__(self) to return id(self):

User-defined classes have __eq__() and __hash__() methods by default; with them, all objects compare unequal (except with themselves) and x.__hash__() returns id(x).
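Applied to the ChangeTracker from the question, the lazy approach might look roughly like this. Everything here is a guess at the asker's design: Document, check_in and _snapshot are hypothetical names, and the sketch pickles vars(obj) rather than the object itself just to stay self-contained. Note that an id-based hash next to a value-based __eq__ means equal objects hash differently, so the mapping effectively keys on identity. That is what a change tracker wants, but it does break the usual "equal objects hash the same" rule for ordinary dicts and sets.

```python
import pickle
import weakref


class Document:
    """Hypothetical tracked object: value-based __eq__, identity-based hash."""

    def __init__(self, text):
        self.text = text

    def __eq__(self, other):
        return isinstance(other, Document) and self.text == other.text

    def __hash__(self):
        # The lazy fix: hash by identity so instances can be dict keys again.
        return id(self)


def _snapshot(obj):
    # Assumption: pickling the attribute dict is a good enough stand-in
    # for the asker's "pickle dump" of the whole object.
    return pickle.dumps(vars(obj))


class ChangeTracker:
    """Remembers the last snapshot of each object without keeping it alive."""

    def __init__(self):
        self._dumps = weakref.WeakKeyDictionary()

    def check_in(self, obj):
        """Store obj's current snapshot; return True if it differs from the previous one."""
        new = _snapshot(obj)
        changed = obj in self._dumps and self._dumps[obj] != new
        self._dumps[obj] = new
        return changed


tracker = ChangeTracker()
doc = Document("draft")
print(tracker.check_in(doc))  # first check-in, nothing to compare against
doc.text = "final"
print(tracker.check_in(doc))  # contents changed since the last check-in
print(tracker.check_in(doc))  # unchanged since the last check-in
```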


I'm no Python expert, but wouldn't it make sense that when you define an __eq__ method, you also have to define a hash method (which calculates the hash value for an object)? Otherwise, the hashing mechanism wouldn't know whether it hit the same object or a different object with the same hash value. Actually, it's the other way around: it would probably end up computing different hash values for objects considered equal by your __eq__ method.

I have no idea what that hash function is called though, __hash__ perhaps? :)
