
Where does that randomness come from?

开发者 (https://www.devze.com), 2023-01-01 07:34. Source: the web

I'm working on a data mining research project and use code from a large SVN repository.

Apparently one of the methods I use from that repository uses randomness somewhere without asking for a seed, so two runs of my program return different results. That's annoying for what I want to do, so I'm trying to locate that "uncontrolled" randomness.

Since the classes I use depend on many others, that's pretty painful to do by hand. Any idea how I could find where that randomness comes from?

Edit:

Roughly, my code is structured as:

- stuff I wrote
- a call to a method I didn't write, involving lots of other classes
- stuff I wrote

I know that the randomness is introduced in the method I didn't write, but can't locate where exactly...

Idea:

What I'm looking for might be a tool or Eclipse plug-in that would let me see each time Random is instantiated during the execution of my program. Does anyone know of something like that?


The default seed of many random number generators is the current time. If it's a cryptographic random number generator, it's a seed that's far more complex than that.

I'd bet that your random numbers are being seeded with the current time. The only way to fix that is to find the code that creates or seeds the random number generator and change it to use a constant seed. I'm not sure what the syntax of that is in Java, but in my world (C#) it's something like:

Random r = new Random(seedValue);
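
For reference, the Java syntax is the same: java.util.Random has a constructor that takes a long seed. A minimal sketch (the class name SeededRandomDemo and the seed 42 are made up for illustration) showing that two generators built from the same seed produce the same sequence:

```java
import java.util.Arrays;
import java.util.Random;

class SeededRandomDemo {
    // Draws n ints in [0, 100) from a generator created with the given seed.
    static int[] draw(long seed, int n) {
        Random r = new Random(seed); // fixed seed -> reproducible sequence
        int[] out = new int[n];
        for (int i = 0; i < n; i++) {
            out[i] = r.nextInt(100);
        }
        return out;
    }

    public static void main(String[] args) {
        long seed = 42L; // arbitrary constant, chosen only for illustration
        System.out.println(Arrays.toString(draw(seed, 5)));
        System.out.println(Arrays.toString(draw(seed, 5))); // prints the same line again
    }
}
```

This only helps once the offending `new Random()` call has been found and can be edited or given a seed parameter.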

So even with an answer from StackOverflow, you still have some detective work to do to find the code you want.


Maybe it's a bit old-fashioned, but...

How about tracing the intermediate results (variables, function arguments) to standard output, capturing the output of two different runs, and checking where they start to differ?
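
A tracing helper along these lines might look like the sketch below. The class name Trace and the "TRACE label = value" format are assumptions, not part of any library. The one real constraint is that the output must be identical across runs when the values are identical, so it avoids timestamps and the default array toString(), which prints an identity hash that can change between runs.

```java
import java.util.Arrays;

class Trace {
    // Builds a trace line whose text depends only on the value's contents.
    static String format(String label, Object value) {
        String text;
        if (value != null && value.getClass().isArray()) {
            // Wrapping in Object[] lets deepToString print primitive and nested
            // arrays by content; the result carries one extra pair of brackets.
            text = Arrays.deepToString(new Object[] {value});
        } else {
            text = String.valueOf(value);
        }
        return "TRACE " + label + " = " + text;
    }

    // Writes the trace line to standard output.
    static void log(String label, Object value) {
        System.out.println(format(label, value));
    }

    public static void main(String[] args) {
        log("weights", new double[] {0.5, 1.5}); // hypothetical intermediate result
        log("iterations", 10);
    }
}
```

Objects that do not override toString() would still print an identity hash, so those are better traced field by field.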


Maybe you want to read this:

In Java, when you create a new Random object without arguments, the seed is derived automatically from the system clock, via System.nanoTime() in current JDKs. If you look at the source of the Random class you will see a constructor something like this:

public Random()
{
    // Simplified; recent JDKs also mix in a per-instance "uniquifier" value.
    this(System.nanoTime());
}

Or maybe this:

In Eclipse you can place your cursor on a variable and press F3 (Open Declaration). This takes you to the point where the variable is declared.

A second tool you can use is "Find usages" (in Eclipse: Search > References, Ctrl+Shift+G). Your IDE will then search for all usages of a method, a variable, or whatever you want.
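
If the sources are checked out locally, a plain text scan can complement the IDE search. The following is a rough sketch, not a complete detector: the class name RandomUsageScanner and the pattern list are assumptions, and it will miss randomness hidden in compiled JARs or reached through other APIs.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class RandomUsageScanner {
    // Substrings that usually indicate a source of randomness.
    // This list is an assumption and is not exhaustive.
    static final String[] PATTERNS = {"new Random(", "Math.random(", "Collections.shuffle("};

    // Returns "name:lineNumber: text" for every line containing one of PATTERNS.
    static List<String> matches(String name, List<String> lines) {
        List<String> hits = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            for (String pattern : PATTERNS) {
                if (lines.get(i).contains(pattern)) {
                    hits.add(name + ":" + (i + 1) + ": " + lines.get(i).trim());
                    break; // report each line only once
                }
            }
        }
        return hits;
    }

    // Scans every regular .java file below root.
    static List<String> scan(Path root) throws IOException {
        List<Path> files;
        try (Stream<Path> walk = Files.walk(root)) {
            files = walk.filter(Files::isRegularFile)
                        .filter(p -> p.toString().endsWith(".java"))
                        .sorted()
                        .collect(Collectors.toList());
        }
        List<String> hits = new ArrayList<>();
        for (Path file : files) {
            // ISO-8859-1 decodes any byte sequence, so odd encodings cannot abort the scan.
            hits.addAll(matches(file.toString(),
                    Files.readAllLines(file, StandardCharsets.ISO_8859_1)));
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        Path root = Paths.get(args.length > 0 ? args[0] : "."); // source root to scan
        scan(root).forEach(System.out::println);
    }
}
```

Each hit is a candidate to inspect by hand; a `new Random(` match with an explicit seed argument is usually harmless.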


Which "big svn" are you using?

You could write some simple tests, to test whether or not two identical calls to underlying functions return two identical results...

Unless you know where the Random object is created, you're going to have to do some detective work this way.
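
Such a test could be as small as the sketch below. DeterminismCheck and sameResultTwice are hypothetical names, and the lambdas in main stand in for the real library calls, which are not shown in the question. It assumes the result type has a meaningful equals() (arrays, for instance, would need Arrays.equals instead).

```java
import java.util.Objects;
import java.util.Random;
import java.util.function.Supplier;

class DeterminismCheck {
    // Invokes the same call twice and reports whether both results are equal.
    static <T> boolean sameResultTwice(Supplier<T> call) {
        T first = call.get();
        T second = call.get();
        return Objects.equals(first, second);
    }

    public static void main(String[] args) {
        // Stand-ins for library methods under suspicion.
        System.out.println(sameResultTwice(() -> new Random(1L).nextInt())); // seeded generator
        System.out.println(sameResultTwice(() -> new Random().nextInt()));   // unseeded generator
    }
}
```

Applying the check to progressively smaller pieces of the call chain narrows down which method introduces the variation.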

How much of this code is open to you?


Why don't you insert a lot of logging calls (e.g. to standard error) that trace the state of the values you are concerned about throughout the program?

You can compare the trace across two successive runs to narrow down where the randomness is happening by searching for the first difference in the two log files.

Then you can insert more logging calls in that area until you precisely identify the problem.
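
Finding the first difference does not need a special tool. A small sketch, assuming each run's trace was redirected to its own file (the names run1.log and run2.log and the class TraceDiff are made up):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

class TraceDiff {
    // Returns the 0-based index of the first line where the traces differ,
    // or -1 if both traces are identical.
    static int firstDifference(List<String> a, List<String> b) {
        int common = Math.min(a.size(), b.size());
        for (int i = 0; i < common; i++) {
            if (!a.get(i).equals(b.get(i))) {
                return i;
            }
        }
        // One trace is a prefix of the other: they differ where the shorter one ends.
        return a.size() == b.size() ? -1 : common;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java TraceDiff run1.log run2.log");
            return;
        }
        List<String> a = Files.readAllLines(Paths.get(args[0]));
        List<String> b = Files.readAllLines(Paths.get(args[1]));
        int i = firstDifference(a, b);
        if (i < 0) {
            System.out.println("traces are identical");
        } else {
            System.out.println("first difference at line " + (i + 1));
            System.out.println("run 1: " + (i < a.size() ? a.get(i) : "<end of file>"));
            System.out.println("run 2: " + (i < b.size() ? b.get(i) : "<end of file>"));
        }
    }
}
```

The randomness was introduced somewhere between the last matching trace line and the first differing one, which is where the extra logging calls should go.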


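Java's Set implementations do not all guarantee an iteration order. HashSet in particular makes no such guarantee, so even if you run a program twice on the same machine, the order in which a set is traversed may change (for example, when the elements rely on the default Object.hashCode()). To rule this out, the affected sets would have to be replaced by something with a defined order, such as lists, LinkedHashSet, or TreeSet.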
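
As an illustration of set types with a defined order, here is a minimal sketch using LinkedHashSet (insertion order) and TreeSet (sorted order), both from java.util. The class and method names are made up, and whether the library in question can be changed this way is an open assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.TreeSet;

class SetOrderDemo {
    // LinkedHashSet iterates in insertion order; duplicates keep their first position.
    static List<String> insertionOrder(String... items) {
        return new ArrayList<>(new LinkedHashSet<>(Arrays.asList(items)));
    }

    // TreeSet iterates in the natural (sorted) order of its elements.
    static List<String> sortedOrder(String... items) {
        return new ArrayList<>(new TreeSet<>(Arrays.asList(items)));
    }

    public static void main(String[] args) {
        System.out.println(insertionOrder("b", "a", "b", "c")); // [b, a, c]
        System.out.println(sortedOrder("b", "a", "b", "c"));    // [a, b, c]
    }
}
```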
