The Fisher-Yates shuffle gives a nice algorithm to shuffle an array A
of length n
in a single pass:
For k = 1 to n
Pick a random integer j from k to n
Swap A[k] and A[j]
After a single pass through this algorithm, the entries of A
occur uniformly at random.
A common way to botch this algorithm is to do the following:
For k = 1 to n
Pick a random integer j from 1 to n
Swap A[k] and A[j]
The resulting distribution from a single pass through this algorithm is not uniformly random, and there is a nice discussion of what it is at this post: What distribution do you get from this broken random shuffle?
I recently read a delightful article by Diaconis, Fulman and Holmes entitled Analysis of Casino Shelf Shuffling Machines where the authors describe a physical machine that does the following batch shuffle:
For k = 1 to n
Pick a random integer j from 1 to 10
Randomly choose to place card k on the top or bottom of stack j
The question the authors address is whether or not this gives a reasonably random ordering after a single pass. The answer is decidedly not. One way to see the flaw in this shuffle is to start with a deck of cards that has n/2
red cards atop of n/2
black cards. The resulting deck after a single pass will have at most 10 clumps of red cards! For n = 52*6
, this isn't terribly random. The authors also show that an optimal "guess the next card" strategy for the once shuffled will, on average, correctly guess 9.5 cards, whereas an optimal strategy for a random deck will average only 4.5 cards correctly guessed.
Are there any other interesting single-pass shuffles that achieve near-randomness and/or interesting distributions? I'm especially interested in shuffles similar to the latter that work with batches of entries.
If you have a shuffled desk, into which you wish to shuffle a batch of new cards (and you know that none of the cards are duplicates), then I think the following is valid.
ForEach card in batch:
gap = random(deck.size() + 1) # choose a gap between cards, before first, or after last.
deck.insertAt(gap,card)
Distribution
The distribution of random is uniform, and the order of the deck is unchanged, so still uniform. I think the result should be uniform. (My stats is too rusty to be sure).
Time
Assuming that insertAt is O(1) not O(N) - which depends upon the implementeation of deck - the whole routine is O(batch size) - which is the best you can hope for becuase you have to handle each card.
精彩评论