I need to design object that support some sort of uncertainty (or wild characters, if you wish) it its components. The work is done in Python.
Consider the following class
class C():
def __init__(self, p1):
self.p1 = p1
The property p1
can be either "x", "y", "z", but sometimes "x or y", or any other combination.
It is required that if p1
of c1
is 'x' and p1
of c2
is 'x or y', then c1 == c2
will return True
. This is easily achieved by supplying a proper __eq__
function.
However, these objects need also to be stored in sets, therefore I need to supply a __hash__
开发者_JAVA百科 function.
How would you calculate hash function for this case, such if c1 == c2
then hash(c1) == hash(c2)
?
Option 1: hashing the property
Not good Here's why
c1 = C('x')
c2 = C('x or y or z')
c1 == c2 #True
hash(c1) == hash(c2)#False
Your equality criterion is not transitive, and therefore invalid:
C('x') == C('x or y') == C('y')
but
C('x') != C('y')
Since you can construct an element that equals all others C('x or y or z or a or ...')
, the only hash function that fulfills c1 == c2 ⇒ hash(c1) == hash(c2) is a constant one, i.e.
def __hash__(self):
return 0
I have just realized that my design requirement is faulty. The requirement that C('x') == C('x or y') is True and that C('y') == C('x or y') is True will also require that C('x') == C('y') will also be True. Looks that I need to re-think my design and maybe give up the ability to have object hashes. What do you think?
It is required that if
p1
ofc1
is 'x' andp1
ofc2
is 'x or y', thenc1 == c2
will returnTrue
.
That's very sketchy (i.e. probably bad) design. Equality is supposed to be transitive, so that if c1 == c2
and c2 == c3
, then c1 == c3
. Now, your specification requires that C('x') == C('x or y')
and C('x or y') == C('y')
, which should imply that C('x') == C('y')
- but you probably don't want that to be true. (And I see that you figured that out while I was writing this.)
What I would suggest is that you leave __eq__
alone and use a completely different method to perform these "fuzzy" comparisons, perhaps something like is_compatible_with
. Or if you are going to reimplement __eq__
, at least make it something sensible that obeys the transitive property, such as just comparing the string arguments. That may mean that __eq__
isn't very useful for your particular application, but that's okay; that's why you can create other methods.
The easiest solution would be to make all your objects return the same hash. This degrades the set O(1) performance to O(n) as the contained objects will all be inserted into the same slot. The discrimination will then be done using the __eq__
method.
Regarding your requirements, David has already replied extensively.
精彩评论