Fast random selection algorithm_问答_开发者_运维开发者技术经验分享

开发者 https://www.devze.com 2022-12-13 07:56 出处：网络

Given an array of true/false values, what is the most efficient algorithm to select an index with a true value at random.

A sketch simple algorithm is

a <- the 开发者_StackOverflow中文版array
c <- 0
for i in a:
    if a[i] is true: c++
e <- random number in (0, c-1)
j <- 0
for i in e:
    while j is false: j++
return j

Can anyone come up with a faster algorithm? Maybe there is a way to only walk through the list once even if the number of true elements is not known at first?

Use the "pick a random element from an infinite list" algorithm.

Keep an index of your current pick, and also a count of how many true values you've seen.

When you see a true value, increment the count and then replace your pick with the current index with a probability of P=(1/count). (So you always pick the first one you find... then you might switch to the second one, with probability 1/2, then you might switch to the third one with probabilty 1/3 etc.)

This requires only one scan over the list and constant storage. (It does require you to work out a larger number of random numbers, however.) In particular, it doesn't ever require you to either buffer the list or go back to the start - so it can work on an unbounded input stream.

See this answer for a sample LINQ implementation of the simple "pick a random element" algorithm; it would just need minor tweaks.

Build a list with indexes that point to true values and select one of those at random. Requires O(n) for list traversal and one try for the random number.