开发者

Hashing uncertain objects

开发者 https://www.devze.com 2023-03-20 00:58 出处:网络
I need to design object that support some sort of uncertainty (or wild characters, if you wish) it its components.

I need to design object that support some sort of uncertainty (or wild characters, if you wish) it its components. The work is done in Python.

Consider the following class

class C():
    def __init__(self, p1):
        self.p1 = p1

The property p1 can be either "x", "y", "z", but sometimes "x or y", or any other combination.

It is required that if p1 of c1 is 'x' and p1 of c2 is 'x or y', then c1 == c2 will return True. This is easily achieved by supplying a proper __eq__ function. However, these objects need also to be stored in sets, therefore I need to supply a __hash__开发者_JAVA百科 function. How would you calculate hash function for this case, such if c1 == c2 then hash(c1) == hash(c2)?

Option 1: hashing the property

Not good Here's why

c1 = C('x')
c2  = C('x or y or z')
c1 == c2 #True
hash(c1) == hash(c2)#False


Your equality criterion is not transitive, and therefore invalid:

C('x') == C('x or y') == C('y')

but

C('x') != C('y')

Since you can construct an element that equals all others C('x or y or z or a or ...'), the only hash function that fulfills c1 == c2 ⇒ hash(c1) == hash(c2) is a constant one, i.e.

def __hash__(self):
    return 0


I have just realized that my design requirement is faulty. The requirement that C('x') == C('x or y') is True and that C('y') == C('x or y') is True will also require that C('x') == C('y') will also be True. Looks that I need to re-think my design and maybe give up the ability to have object hashes. What do you think?


It is required that if p1 of c1 is 'x' and p1 of c2 is 'x or y', then c1 == c2 will return True.

That's very sketchy (i.e. probably bad) design. Equality is supposed to be transitive, so that if c1 == c2 and c2 == c3, then c1 == c3. Now, your specification requires that C('x') == C('x or y') and C('x or y') == C('y'), which should imply that C('x') == C('y') - but you probably don't want that to be true. (And I see that you figured that out while I was writing this.)

What I would suggest is that you leave __eq__ alone and use a completely different method to perform these "fuzzy" comparisons, perhaps something like is_compatible_with. Or if you are going to reimplement __eq__, at least make it something sensible that obeys the transitive property, such as just comparing the string arguments. That may mean that __eq__ isn't very useful for your particular application, but that's okay; that's why you can create other methods.


The easiest solution would be to make all your objects return the same hash. This degrades the set O(1) performance to O(n) as the contained objects will all be inserted into the same slot. The discrimination will then be done using the __eq__ method.

Regarding your requirements, David has already replied extensively.

0

精彩评论

暂无评论...
验证码 换一张
取 消