HashSet look-up complexity?_问答_开发者_运维开发者技术经验分享

开发者 https://www.devze.com 2023-03-17 09:36 出处：网络

A look-up operatio开发者_如何学Gon OR contains for single can be O(n) in worst-case right ? So, for n elements look up in hashSet will be O(n^2)?Yes, but it\'s really the worst case: if all the elemen

A look-up operatio开发者_如何学Gon OR contains for single can be O(n) in worst-case right ? So, for n elements look up in hashSet will be O(n^2)?

Yes, but it's really the worst case: if all the elements in the HashSet have the same hash code (or a hash code leading to the same bucket). With a correctly written hashCode and a normally distributed key sample, a lookup is O(1).

Yes, but the whole reason we have HashSets is that we encounter this worst case with very, very low probability, and it's usually much faster than the guaranteed nlogn for a heap or a (self-balancing) TreeSet, or the guaranteed n^2 for an unsorted list.

As already noted in the earlier answers, the lookup time complexity is O(1). To make make sure that it's true, just look into a source code for contains():

...

private transient HashMap<E,Object> map;

...

public boolean contains(Object o) {
    return map.containsKey(o);
}

...

As you can see, it uses a HashMap object internally, to check if your object exists.

Then, if we take a look into an implementation of contains() for HashMap, then we'll see the code as follows:

public boolean containsKey(Object key) {
    return getNode(hash(key), key) != null;
}

getNode() searches for a node based on a key hash value and a key value. Please note that hash(key) has an O(1) time complexity.

And finally, for getNode():

final Node<K,V> getNode(int hash, Object key) {
    Node<K,V>[] tab; 
    Node<K,V> first, e; 
    int n; 
    K k;
    if ((tab = table) != null && (n = tab.length) > 0 &&
        (first = tab[(n - 1) & hash]) != null) {
        if (first.hash == hash && // always check first node
            ((k = first.key) == key || (key != null && key.equals(k))))
            return first;
        if ((e = first.next) != null) {
            if (first instanceof TreeNode)
                return ((TreeNode<K,V>)first).getTreeNode(hash, key);
            do {
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))
                    return e;
            } while ((e = e.next) != null);
        }
    }
    return null;
}

The most important part is basically the first inner if block:

...
        if (first.hash == hash &&
            ((k = first.key) == key || (key != null && key.equals(k))))
            return first;
...

If hashes of your object key and that of the first element first are equal, and objects themselves are equal (obviously!), then first is the object we're looking for, and this is O(1).

As you can see, it all depends on the implementation of the hash function - if it's good, then it will mostly assign different buckets for different key objects. If not, then several key objects may reside in the same bucket, and so we will need to do a lookup in the bucket itself to find the right key as seen here:

...
        if (first instanceof TreeNode)
            return ((TreeNode<K,V>)first).getTreeNode(hash, key);
        do {
            if (e.hash == hash &&
                ((k = e.key) == key || (key != null && key.equals(k))))
                return e;
...
        } while ((e = e.next) != null);

However, even in this case, if your bucket is a TreeNode it is O(log(k)) (k - number of elements in the bucket) because it's a balanced binary search tree. If not (the else block), it's O(k). But again, this will happen rarely (or maybe even never for some types of objects), and so the average time complexity for one call of the contains method will remain O(1). Obviously, if you perform n calls then the total time complexity will be linear.

lookp takes O(c)

c = constant value