During an assignment, I was asked to show that a hash table of size m (m>3, m is prime) that is less than half full, and that uses quadratic checking (hash(k, i) = (h(k) + i^2) mod m
) we will always find a free spot.
I've checked and arrived to the conclusion that the spots that will be found (when h(k)=0)开发者_高级运维 are 0 mod m
, 1 mod m
, 4 mod m
, 9 mod m
, ...
Can anyone please hint me towards the way to solve this?
Thanks!
0, 1, 4, ..., ((m-1)/2)^2 are all distinct mod m. Why?
Suppose two numbers from that range, i^2 and j^2, are equivalent mod m.
Then i^2 - j^2 = (i-j)(i+j) = 0 (mod m). Since m is prime, m must divide one of those factors. But the factors are both less than m, so one of them ((i-j)) is 0. That is, i = j.
Since we are starting at 0, more than half the slots that are distinct. If you can only fill less than m/2 of them, at least one remains open.
Let's break the proof down.
Setup
First, some background.
With a hash table, we define a probe sequence
P
. For any itemq
, followingP
will eventually lead to the right item in the hash table. The probe sequence is just a series of functions{h_0, ..., h_M-1}
whereh_i
is a hash function.To insert an item
q
into the table, we look ath_0(q)
,h_1(q)
, and so on, until we find an empty spot. To findq
later, we examine the same sequence of locations.
In general, the probe sequence is of the form h_i(q) = [h(q) + c(i)] mod M
, for a hash table of size M
, where M
is a prime number. The function c(i)
is the collision-resolution strategy, which must have two properties:
First, c(0) = 0
. This means that the first probe in the sequence must be equal to just performing the hash.
Second, the values {c(0) mod M, ..., c(M-1) mod M}
must contain every integer between 0 and M-1. This means that if you keep trying to find empty spots, the probe sequence will eventually probe every array position.
Applying quadratic probing
Okay, we've got the setup of how the hash table works. Let's look at quadratic probing. This just means that for our c(i)
we're using a general quadratic equation of the form ai^2 + bi + c
, though for most implementations you'll usually just see c(i) = i^2
(that is, b, c = 0
).
Does quadratic probing meet the two properties we talked about before? Well, it's certainly true that c(0) = 0
here, since (0)^2
is indeed 0
, so it meets the first property. What about the second property?
It turns out that in general, the answer is no.
Theorem. When quadratic probing is used in a hash table of size
M
, whereM
is a prime number, only the firstfloor[M/2]
probes in the probe sequence are distinct.
Let's see why this is the case, using a proof by contradiction.
Say that the theorem is wrong. Then that means there are two values
a
andb
such that0 <= a < b < floor[M/2]
that probe the same position.h_a(q)
andh_b(q)
must probe the same position, by (1), soh_a(q) = h_b(q)
.h_a(q) = h_b(q) ==> h(q) + c(a) = h(q) + c(b)
,mod M
.The
h(q)
on both sides cancel. Ourc(i)
is justc(i) = i^2
, so we havea^2 = b^2
.Solving the quadratic equation in (4) gives us
a^2 - b^2 = 0
, mod M. This is a difference of two squares, so the solution is(a - b)(a + b) = 0
,mod M
.But remember, we said
M
was a prime number. The only way that(a - b)(a + b)
can be zeromod M
is if [case I] (a - b) is zero, or [case II] (a + b) is zeromod M
.Case I can't be right, because we said that
a != b
, soa - b
must be something other than zero.The only way for
(a + b)
to be zeromod M
is fora + b
to be equal to be a multiple ofM
or zero. They clearly can't be zero, since they're both bigger than zero. And since they're both less thanfloor[M/2]
, their sum must be less thanM
. So case II can't be right either.
Thus, if the theorem were wrong, one of two quantities must be zero, neither of which can possibly be zero -- a contradiction! QED: quadratic probing doesn't satisfy property two once your table is more than half full and if your table size is a prime number. The proof is complete!
From Wikipedia:
For prime m > 2, most choices of c1 and c2 will make h(k,i) distinct for i in [0,(m − 1) / 2]. Such choices include c1 = c2 = 1/2, c1 = c2 = 1, and c1 = 0,c2 = 1. Because there are only about m/2 distinct probes for a given element, it is difficult to guarantee that insertions will succeed when the load factor is > 1/2.
See the quadratic probing section in Data Structures and Algorithms with Object-Oriented Design Patterns in C++ for a proof that m/2 elements are distinct when m is prime.
精彩评论