
Equals method benefit for hashtable implementation in Java?

开发者 https://www.devze.com 2023-03-26 00:16 Source: Internet
For the benefit of a hashtable we have two methods, hashCode and equals. Internally, when we add a key-value pair to a Hashtable, it first calls the key's hashCode method and checks whether the result equals the hash code of any previous key. If it does not, it simply adds the key-value pair to the hashtable; if it does, it calls the key's equals method, where we again provide some logic to check whether the objects are equal. So my question is: can we eliminate the work we do in equals and put the same kind of logic inside hashCode, returning a different hash code for each distinct object (based on the logic we would have put in equals)? That way we could manage the hashtable with the hashCode method only and eliminate the need for the equals method.

Take the example of an Employee class that has id, salary and name as its state. We use Employee as the key in a hashtable, and we override hashCode in a way that serves the purpose of both hashCode and equals. So there would be no need for the equals method.

I know I am missing something here. Looking for it.


Yes, you're missing something.

First: hashCode returns an int, and can thus only return 2^32 different values. equals is thus needed to be able to differentiate between values which have identical hash codes.

Second: the hash table uses the hashCode modulo the number of buckets it maintains. So, even if two keys have different hashCodes, they might fall in the same bucket, and equals will be necessary to differentiate them.
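The second point can be sketched in a few lines. This is a minimal illustration, not the actual java.util code: bucketIndex is a hypothetical helper and the table size of 16 is an assumption; the real classes compute the index somewhat differently, but the effect is the same.

```java
final class BucketDemo {
    // Hypothetical helper: maps a hash code onto one of bucketCount buckets.
    // floorMod keeps the index non-negative even for negative hash codes.
    static int bucketIndex(int hashCode, int bucketCount) {
        return Math.floorMod(hashCode, bucketCount);
    }

    public static void main(String[] args) {
        int buckets = 16;   // assumed table size
        Integer a = 1;      // Integer.hashCode() is the int value itself
        Integer b = 17;

        // The hash codes differ...
        System.out.println(a.hashCode() == b.hashCode());
        // ...yet both keys land in the same bucket, so only equals can tell them apart.
        System.out.println(bucketIndex(a.hashCode(), buckets)
                == bucketIndex(b.hashCode(), buckets));
    }
}
```

With 16 buckets, hash codes 1 and 17 both map to bucket 1, so a lookup in that bucket still has to call equals on each candidate key.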


The problem is that you can't guarantee (as a general condition) that the hashcode will always be unique.

You might be able to write a single class that can: for example, an Employee should be uniquely identified by employeeId. There is no reason your hashCode could not simply be return employeeId; and you would guarantee uniqueness that way.

But a general object will have much more state. Consider a coordinate class:

class Coordinate {
    int x;
    int y;
    int z;

    public boolean equals(Object o) {
        if(o instanceof Coordinate) {
            Coordinate c = (Coordinate)o;
            return x == c.x && y == c.y && z == c.z;
        }
        return false;
    }

    public int hashCode() {
        return x ^ y ^ z;
    }
}

Your x, y and z allow 2^96 different combinations, but there are only 2^32 possible hash codes. For example, 1,2,3 and 3,2,1 would both produce the same hash code. Now you could improve this to make the hashCode something like

public int hashCode() { int c = x; c = c * 31 + y; c = c * 31 + z; return c; }

But this wouldn't get rid of the problem - you'd still be able to come up with thousands of combinations that would cause a hashcode collision.
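As a rough illustration of how easy such collisions are to find, the sketch below assumes the usual 31-multiplier scheme (c = c * 31 + field). The class name and the static hash helper are hypothetical stand-ins for Coordinate.hashCode().

```java
final class CoordinateCollision {
    // Stand-in for Coordinate.hashCode(), assuming the 31-multiplier scheme.
    static int hash(int x, int y, int z) {
        int c = x;
        c = c * 31 + y;
        c = c * 31 + z;
        return c;
    }

    public static void main(String[] args) {
        // Two different coordinates, one hash code:
        // (0, 1, 0)  gives (0 * 31 + 1) * 31 + 0  = 31
        // (0, 0, 31) gives (0 * 31 + 0) * 31 + 31 = 31
        System.out.println(hash(0, 1, 0));
        System.out.println(hash(0, 0, 31));
    }
}
```

Any time y decreases by 1 while z increases by 31, the hash stays the same, so colliding pairs are plentiful and equals is still needed to separate them.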

But fear not: there are such things as what you describe. They're called perfect hashes.


The problem is that hashCode() returns an int, and there are only 2^32 different hashcodes. Therefore, for classes with more than 2^32 different states (i.e. pretty much everything), you cannot avoid returning the same hashcode for some objects even though they are not equal.


The thing you're missing is that some data cannot be uniquely represented by a finite integer. A String is an example.
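A small sketch of the String case: "Aa" and "BB" are two well-known strings whose hashCode values coincide under the algorithm documented for String.hashCode(). The class name and the build helper are made up for this example.

```java
import java.util.HashMap;
import java.util.Map;

final class StringCollision {
    // Hypothetical helper: stores two keys that share a hash code.
    static Map<String, Integer> build() {
        Map<String, Integer> map = new HashMap<>();
        map.put("Aa", 1);
        map.put("BB", 2);
        return map;
    }

    public static void main(String[] args) {
        // 'A' * 31 + 'a' = 65 * 31 + 97 = 2112
        // 'B' * 31 + 'B' = 66 * 31 + 66 = 2112
        System.out.println("Aa".hashCode() == "BB".hashCode()); // true

        // Both entries survive because equals distinguishes the keys.
        Map<String, Integer> map = build();
        System.out.println(map.size());    // 2
        System.out.println(map.get("BB")); // 2
    }
}
```

If the map relied on hashCode alone, the second put would overwrite the first entry.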

Also, equals isn't used only when the hashCodes are the same. Elements are put into a "bucket" that usually covers millions of possible hashCode values (using the modulo operator). So even if every possible object had a unique hashCode, you'd still need to double-check everything.


So my question is: can we eliminate the work we do in equals and put the same kind of logic inside hashCode, returning a different hash code for each distinct object (based on the logic we would have put in equals)?

The equals method is used to prevent duplicate keys from being inserted into a Map (if you go by the API documentation); this includes HashMap and Hashtable. The hashCode method, on the other hand, is used to optimize lookups, but it cannot be relied on to compare two keys for equality, because hash collisions are possible. The Map documentation specifically states:

Implementations are free to implement optimizations whereby the equals invocation is avoided, for example, by first comparing the hash codes of the two keys.

In the event of hash collisions among keys, a single bucket will store two or more values for two or more different keys, and the bucket must be traversed sequentially to find the value matching the key, which is the worst case. That's why using the hash code for comparison is only an optimization: the actual value matching the key can be obtained only via the equals method. Note that this assumes that the same fields used to calculate the hash code are also used to compare for equality.
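To tie this back to the Employee example from the question, here is a hedged sketch of what happens when a key overrides hashCode but not equals. All class and method names are hypothetical, and the key is reduced to a single id field for brevity.

```java
import java.util.HashMap;
import java.util.Map;

final class EmployeeKeyDemo {
    // Overrides hashCode only; equals stays the identity-based Object.equals.
    static final class HashOnlyEmployee {
        final int id;
        HashOnlyEmployee(int id) { this.id = id; }
        @Override public int hashCode() { return id; }
    }

    // Overrides both methods, using the same field in each.
    static final class Employee {
        final int id;
        Employee(int id) { this.id = id; }
        @Override public int hashCode() { return Integer.hashCode(id); }
        @Override public boolean equals(Object o) {
            return o instanceof Employee && ((Employee) o).id == id;
        }
    }

    static boolean foundWithoutEquals() {
        Map<HashOnlyEmployee, String> map = new HashMap<>();
        map.put(new HashOnlyEmployee(7), "Alice");
        // Same hash code, but a different instance: the lookup fails.
        return map.containsKey(new HashOnlyEmployee(7));
    }

    static boolean foundWithEquals() {
        Map<Employee, String> map = new HashMap<>();
        map.put(new Employee(7), "Alice");
        // Same hash code and equals returns true: the lookup succeeds.
        return map.containsKey(new Employee(7));
    }

    public static void main(String[] args) {
        System.out.println(foundWithoutEquals()); // false
        System.out.println(foundWithEquals());    // true
    }
}
```

Even with a perfectly unique hash code, the map still calls equals to confirm a match, so a key class needs both methods overridden consistently.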
