开发者

What shuffling algorithms exist besides Fisher-Yates and finding the "next permutation?"

开发者 https://www.devze.com 2023-01-14 07:50 出处:网络
Specifically in the domain of one-dimensional sets of items of the same type, such as a vector of integers.

Specifically in the domain of one-dimensional sets of items of the same type, such as a vector of integers.

Say, for example, you had a vector of size 32,768 containing the sorted integers 0 through 32,767.

What I mean by "next permutation" is performing the next permutation in a lexical ord开发者_C百科ering system.

Wikipedia lists two, and I'm wondering if there are any more (besides something bogo :P)


O(N) implementation This is based on Eyal Schneider's mapping Zn! -> P(n)

def get_permutation(k, lst):
    N = len(lst)
    while N:
        next_item = k/f(N-1)
        lst[N-1], lst[next_item] = lst[next_item], lst[N-1]
        k = k - next_item*f(N-1)
        N = N-1
    return lst

It reduces his O(N^2) algorithm by integrating the conversion step with finding the permutation. It essentially has the same form as Fisher-Yates but replaces a call to random with the next step of the mapping. If the mapping is in fact a bijection (which I'm working to prove) then this is a better algorithm than Fisher-Yates because it only calls out to pseudo random number generator once and so will be more efficient. Note also that this returns the action of permutation (N! - k) rather than permutation k but that's of little consequence because if k is uniform on [0, N!], then so is N! - k.

old answer

This is slightly related to the idea of "next" permutation. If the items can be well ordered, then one can construct lexicographical ordering on the permutations. This allows you to construct a map from the integers into the space of permutations.

Then finding a random permutation is equivalent to choosing a random integer between 0 and N! and constructing the corresponding permutation. This algorithm will be as efficient as (and as difficult to implement) as calculating the n'th permutation of the set in question. This trivially gives a uniform choice of permutation if our choice of n is uniform.

A little more detail about ordering the permutations. given a set S = {a b c d}, mathematicians view the set of permutations of S as a group with the operation of composition. if p is one permutation, lets say (b a c d), then p operates on S by taking b to a, a to c, c to d and d to b. if q is another permutation, lets say (d b c a) then pq is obtained by first applying q and then p which gives (d a b)(c). for example, q takes d to b and p takes b to a so that pq takes d to a. You'll see that pq has two cycles because it takes b to d and fixes c. It's customary to omit 1-cycles but I left it in for clarity.

We're going to use some facts from group theory.

  1. disjoint cycles commute. (a b)(c d) is the same as (c d)(a b)
  2. we can arrange elements in a cycle in any cyclic order. that is (a b c) = (b c a) = (c a b)

So given a permutation, order the cycles so that the largest cycles come first. When two cycles are the same length, arrange their items so that the largest (we can always order a denumerable set, even if arbitrarily so) item comes first. Then we just have a lexicographical ordering first on the length of the cycles, then on their contents. This is well ordered because two permutations that consist of the same cycles must be the same permutation so if p > q and q > p then p = q.

This algorithm can be trivially executed in O(N!logN! + N!) time. just construct all the permutations (EDIT: Just to be clear, I had my mathematician hat on when I proposed this and it was tongue in cheek anyway) , quicksort them and find the n'th. It is a different algorithm than the two you mention though.


Here is an idea on how to improve aaronasterling's answer. It avoids generating all N! permutations and sorting them according to their lexicographic order, and therefore has a much better time complexity.

Internally it uses an unusual permutation representation, that simulates a selection & removal process from a shrinking array. For example, the sequence <0,1,0> represents a permutation resulting from removing item #0 from [0,1,2], then removing item #1 from [1,2], and then removing item #0 from [1]. The resulting permutation is <0,2,1>. With this representation, the first permutation will always be <0,0,...0>, and the last one will always be <N-1,N-2,...0>. I will call this special representation the "array representation".

Clearly, an array representation of size N can be converted to a standard permutation representation in O(N^2) time, by using an array and shrinking it when necessary.

The following function can be used to return the Kth permutation on {0,1,2...,N-1}, in the array representation:

getPermutation(k, N) {
    while(N > 0) {
        nextItem = floor(k / (N-1)!)
        output nextItem
        k = k - nextItem * (N-1)!
        N = N - 1
    }
}

This algorithm works in O(N^2) time (due to the representation conversion), instead of O(N! log N) time.

--Example--

getPermutation(4,3) returns <2,0,0>. This array representation corresponds to <C,A,B>, which is really the permutation at index 4 in the ordered list of permutations on {A,B,C}:

ABC
ACB
BAC
BCA
CAB
CBA


You can adapt merge sort such that it will shuffle the input randomly instead of sorting it.

In particular, when merging two lists, you choose the new head element at random instead of choosing it to be the smallest head element. The probability of choosing the element from the first list must be n/(n+m) where n is the length of the first and m the length of the second list for this to work.

I've written a detailed explanation here: Random Permutations and Sorting.


Another possibility is to build an LFSR or PRNG with a period equal to the number of items you want.


Start with a sorted array. Pick 2 random indexes, switch the elements at those indexes. Repeat O(n lg n) times.

You need to repeat O(n lg n) times to ensure that the distribution approaches uniform. (You need to make sure that each index is picked at least once, which is a balls-in-bins problem.)

0

精彩评论

暂无评论...
验证码 换一张
取 消