开发者

Random numbers probability

开发者 https://www.devze.com 2023-01-15 10:11 出处:网络
I am trying to randomly choose from e.g. 4 numbers. I need to compare the probability of these 2 algorithms.

I am trying to randomly choose from e.g. 4 numbers. I need to compare the probability of these 2 algorithms.

1#

                int a = random.Next(0, 4);

 开发者_运维百科               if (a = 0)
                    statement1
                if (a = 1) 
                    statement2
                if (a = 2)
                    statement3
                if (a = 3) 
                    statement4

2#

                int a = random.Next(0, 1000)

                if (a < 250)
                    statement1
                if (a >= 250 && a < 500) 
                    statement2
                if (a >= 500 && a < 750)
                    statement3
                if (a >= 750) 
                    statement4

Am I right if I think that it is the same ? The probability of statement1 in the first code is 1/4 and in the second code it is 250/1000 so it’s 1/4 too. But someone has told me when I use bigger range of random numbers like in code 2# it’s statistically more accurate. I’ve made project which repeats many times those codes, but I’m not sure it shows me some results.


They are exactly equivalent (except for the fact that the first one won't compile due to using = instead of == in the if-clauses).

To prove this, look at the implementation of Random.Next(int, int). With your values, Random.Next(0, 4) is

(int) (Random.Sample() * 4)

and

Random.Next(0, 1000) is

(int) (Random.Sample() * 1000)

, where Random.Sample() is a private method that returns a random double.

It should now be easy to see that Random.Next(0, 4) will return 0 exactly when Random.Next(0, 1000) will return a number between 0 and 250.


Pseudorandom numbers should be evenly distributed no matter what the range is. If, in your second example, if you just choose the last 4 bits (a & 3), you will get the same distribution as if you choose the next 4 with (a>>2) & 3. I.e. what you are algorithmically doing in the second example using ranges, is discarding a lot of the information the random generator has given you. You get no more "randomness" with a larger range.

Having said this, pseudorandom generators do have their idiosyncracies, but unless you are serious about this it's not worth worrying about!


The distribution is uniform and it's easy to verify:

public class Program
{
    static void Main(string[] args)
    {
        var random = new Random();
        const int iterations = 10000000;

        var hits1 = 1.0 * Enumerable.Range(1, iterations)
                                     .Select(i => random.Next(0, 4))
                                     .Where(i => i == 0).Count();
        Console.WriteLine(hits1 / iterations);

        var hits2 = 1.0 * Enumerable.Range(1, iterations)
                                     .Select(i => random.Next(0, 1000))
                                     .Where(i => i < 250)
                                     .Count();
        Console.WriteLine(hits2 / iterations);
    }
}


My tests are as follows

Out of a 10K loop 2 tests was run with a range 1-4 and a range 1-1000, heres the results

1-4

  1 > 2484 times
  2 > 2519 times
  3 > 2511 times
  4 > 2487 times

0 - 1000

  1 - 250    > 2421 times
  250 - 500  > 2531 times
  500 - 750  > 2529 times
  750 - 1000 > 2490 times

my conclusion is that they make no difference what so ever, you have to get into matrix's and so forth to have some control over random number generation and so forth.

Note: my tests was done with PHP and source code is below.


<?php

$first = array(1=>0,2=>0,3=>0,4=>0);
$second = array('0 - 250' => 0, '250 - 500' => 0, '500 - 750' => 0,'750 - 1000' => 0);

for($i=0;$i<=10000;$i++)  //10K
{
    //First
    $f_number = rand(1,4);
    switch($f_number)
    {
        case 1: $first[$f_number]++; break;
        case 2: $first[$f_number]++; break;
        case 3: $first[$f_number]++; break;
        case 4: $first[$f_number]++; break;
    }

    //Second
    $s_number = rand(1,1000);
    if($s_number < 250) $second['0 - 250']++;
    if($s_number > 250 && $s_number < 500) $second['250 - 500']++;
    if($s_number > 500 && $s_number < 750) $second['500 - 750']++;
    if($s_number > 750) $second['750 - 1000']++;
}

var_dump($first,$second);
?>
0

精彩评论

暂无评论...
验证码 换一张
取 消

关注公众号