Does anyone understand the "quality" of a hash?

开发者 https://www.devze.com 2023-03-23 17:16 Source: web
The "quality" of a hash is defined as the total number of comparisons needed to access every element once, relative to the expected number needed for a random hash. The value can go over 100%.

The total number of comparisons is equal to the sum of the squares of the number of entries in each bucket. For a random hash of "<n>" keys into "<k>" buckets, the expected value is:

n + n * (n - 1) / (2 * k)

What exactly is the quality of a hash?


It is a measure of how evenly the hash function distributes the keys. Ideally, the hash function would place every key into its own bucket, but that does not happen because you cannot have that many buckets (and even then there are hash collisions, so distinct values still end up in the same bucket).

The performance of the hash (ideally just looking up a bucket and examining the single element in it) degrades when you have buckets with many elements in them: if that happens, you have to go through all of them linearly.
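To make the cost of crowded buckets concrete, here is a minimal Python sketch. It assumes a chained hash table where each bucket is scanned from the front, so the key at position i costs i + 1 comparisons. The function name `comparisons_to_access_all` and the sample buckets are made up for illustration.

```python
def comparisons_to_access_all(buckets):
    """Total key comparisons needed to look up every stored key once.

    Assumption: each bucket is a list scanned linearly from the front,
    so the key at 0-based position i costs i + 1 comparisons.
    """
    total = 0
    for bucket in buckets:
        for position, _key in enumerate(bucket):
            total += position + 1
    return total


# Four keys, one per bucket: every lookup hits on the first comparison.
spread_out = [["a"], ["b"], ["c"], ["d"]]

# The same four keys crammed into a single bucket.
crowded = [["a", "b", "c", "d"], [], [], []]

print(comparisons_to_access_all(spread_out))  # 4
print(comparisons_to_access_all(crowded))     # 10
```

The same four keys cost 4 comparisons when spread out and 10 when they all share one bucket (1 + 2 + 3 + 4).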

A quality of 100% is what you would expect for a hash function that scatters its keys at random. In that case, the keys are spread across the buckets about as evenly as chance allows. If you have more than 100%, your data is hashed less evenly than that, and lookups take more time.
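As a rough illustration, the quality figure can be computed from the bucket sizes alone. The sketch below is an assumption-laden reading of the quoted definition, not the actual implementation: it charges a bucket holding c entries c * (c + 1) / 2 comparisons (the linear-scan cost), because that is the count whose expected value for a random hash works out to n + n * (n - 1) / (2 * k). The names `expected_comparisons`, `hash_quality` and `simulate_random_hash` are hypothetical.

```python
import random


def expected_comparisons(n, k):
    """Expected comparisons for n keys hashed at random into k buckets."""
    return n + n * (n - 1) / (2 * k)


def hash_quality(bucket_sizes):
    """Actual comparisons divided by the random-hash expectation.

    Assumption: a bucket with c entries costs c * (c + 1) / 2 comparisons
    to access each of its entries once (linear scan).
    """
    n = sum(bucket_sizes)
    k = len(bucket_sizes)
    actual = sum(c * (c + 1) // 2 for c in bucket_sizes)
    return actual / expected_comparisons(n, k)


def simulate_random_hash(n, k, rng):
    """Drop n keys into k buckets uniformly at random; return bucket sizes."""
    sizes = [0] * k
    for _ in range(n):
        sizes[rng.randrange(k)] += 1
    return sizes


rng = random.Random(0)
random_sizes = simulate_random_hash(10_000, 1_024, rng)

# A badly skewed table: almost every key lands in the first bucket.
skewed_sizes = [10_000 - 1_023] + [1] * 1_023

print(f"random: {hash_quality(random_sizes):.1%}")
print(f"skewed: {hash_quality(skewed_sizes):.1%}")
```

With these assumptions the randomly filled table should come out close to 100%, while the skewed table lands far above it. A perfectly even table, such as `[1, 1, 1, 1]`, scores below 100%, since it is more even than chance would produce.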
