Is there any way to randomly generate a set of positive numbers such that they have a desired mean and standard deviation?
I have an algorithm to generate numbers with a Gaussian distribution, but I don't know how to deal with negative numbers in a way that preserves the mean and standard deviation.
It looks like a Poisson distribution might be a good approximation, but it takes only a mean.
EDIT: There's been some confusion in the responses, so I'll try to clarify.
I have a set of numbers that give me a mean and a standard deviation. I would like to generate an equally sized set of numbers with an equivalent mean and standard deviation. Normally, I would use a Gaussian distribution to do this; however, in this case I have an additional constraint that all values must be greater than zero.
The algorithm I'm looking for doesn't need to be gaussian-based (judging by the comments so far, it probably shouldn't be) and doesn't need to be perfect. It doesn't matter if the resulting number set has a slightly different mean/standard deviation -- I just want something that will usually be in the ballpark.
You may be looking for a log-normal distribution, as David Norman suggested, or maybe exponential, binomial, or some other distribution. An algorithm that generates one distribution is generally not suitable for generating numbers that conform to another. But only you know how your numbers are really distributed.
With normal distribution, the random variable's range is from negative infinity to positive infinity, so if you're looking for positive numbers only, then it is not Gaussian.
Different distributions also have unique properties. For example, with the Poisson distribution the variance is always equal to the mean, so the standard deviation is always the square root of the mean. (That's why your library function doesn't ask for a standard deviation parameter, only the mean.)
In the worst case, you could generate a random real number between 0 and 1 and map it through the inverse of the distribution's cumulative distribution function, worked out on your own. (Depending on the distribution, this may be much easier said than done.)
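A minimal sketch of turning a uniform number on [0, 1) into a draw from another distribution, using inverse transform sampling (feeding the uniform number through the inverse of the cumulative distribution function). The exponential distribution is used here only because its inverse CDF has a closed form; the function name is illustrative and not taken from any library.

```python
import math
import random

def sample_exponential(mean, rng=random):
    """Draw one exponential variate by inverse transform sampling."""
    u = rng.random()  # uniform on [0, 1)
    # Inverse CDF of the exponential distribution: -mean * ln(1 - u).
    # Using 1 - u keeps the argument of log in (0, 1].
    return -mean * math.log(1.0 - u)

if __name__ == "__main__":
    draws = [sample_exponential(2.0) for _ in range(5)]
    print(draws)
```

For distributions without a closed-form inverse CDF, the inverse would have to be computed numerically, which is where this gets harder.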
You could use a log-normal distribution.
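As a rough sketch of how that could look: a log-normal with parameters mu and sigma has mean exp(mu + sigma^2/2) and variance (exp(sigma^2) - 1) * exp(2*mu + sigma^2), so the two parameters can be solved from a target mean m and standard deviation s. The helper names below are made up for illustration; `random.lognormvariate` is from the Python standard library.

```python
import math
import random

def lognormal_params(m, s):
    """Return (mu, sigma) of a log-normal with mean m and std dev s (m > 0)."""
    sigma2 = math.log(1.0 + (s / m) ** 2)
    mu = math.log(m) - sigma2 / 2.0
    return mu, math.sqrt(sigma2)

def sample_lognormal(m, s, n, rng=random):
    """Draw n positive numbers whose distribution has mean m and std dev s."""
    mu, sigma = lognormal_params(m, s)
    return [rng.lognormvariate(mu, sigma) for _ in range(n)]

if __name__ == "__main__":
    print(sample_lognormal(10.0, 3.0, 5))
```

The sample mean and standard deviation of any finite draw will only be close to m and s, not exactly equal.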
First, you can't generate only positive values from a Gaussian distribution.
Second, am I understanding correctly that you are trying to generate a random distribution with given mean and standard deviation? Will any distribution do? If so, let mean = m and standard deviation = s. I am assuming that m - s > 0.
let n = random integer modulo 2;
if n equals 0 return m - s
else return m + s
The values returned by this process will have mean m and standard deviation s.
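A runnable version of the two-outcome process above might look like this in Python. The function name is hypothetical, and the check for m - s > 0 is the assumption stated above made explicit.

```python
import random

def two_point_sample(m, s, rng=random):
    """Return m - s or m + s with equal probability."""
    if m - s <= 0:
        raise ValueError("requires m - s > 0 so both outcomes are positive")
    # randrange(2) yields 0 or 1 with equal probability.
    return m - s if rng.randrange(2) == 0 else m + s

if __name__ == "__main__":
    print([two_point_sample(5.0, 2.0) for _ in range(10)])
```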
Why not use a resampling method? If you have n numbers in your sample, just take n random draws from the sample, with replacement. The resulting set will have expected mean and variance about the same as your original sample, but it will usually be slightly different.
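A minimal sketch of that resampling step, assuming the original numbers are in a Python list. `random.choices` draws with replacement; the function name is illustrative.

```python
import random

def resample(data, rng=random):
    """Draw len(data) values from data, with replacement."""
    return rng.choices(data, k=len(data))

if __name__ == "__main__":
    original = [1.5, 2.0, 3.7, 0.4, 5.1]
    print(resample(original))
```

Because every output value comes from the original sample, the result is positive whenever the original sample is.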
This said, without knowing why you need more random numbers, it's impossible to say what the right answer is. One wonders if you're trying to solve the wrong problem...
I couldn't resist - I really like Jason's angle but wasn't happy that his answer only covers cases where m > s, so I worked out a general solution following his idea.
The simplest distribution with given m, s and non-negative values is:
with probability p, return 0
with probability (1-p), return m / (1-p)
where (1-p) = m^2 / (m^2 + s^2)
Proof: for a distribution X with two outcomes lowX with probability p and highX with probability (1-p),
m = E[X] = p x lowX + (1-p) x highX
s^2 = Variance(X) = E[X^2] - E[X]^2 = p x lowX^2 + (1-p) x highX^2 - m^2
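Set lowX to 0 and solve for highX and p: the first equation gives highX = m / (1-p), and substituting that into the second gives m^2 / (1-p) = m^2 + s^2, so (1-p) = m^2 / (m^2 + s^2).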
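The two-outcome distribution derived here could be sketched in Python as follows. The function and variable names are illustrative only. Note that it returns 0 with probability p, so the values are non-negative rather than strictly positive.

```python
import random

def zero_or_high(m, s, rng=random):
    """Return 0 with probability p, else m / (1 - p), for any m > 0, s >= 0."""
    keep = m * m / (m * m + s * s)  # this is (1 - p)
    high = m / keep                 # equals (m^2 + s^2) / m
    return high if rng.random() < keep else 0.0

if __name__ == "__main__":
    print([zero_or_high(2.0, 5.0) for _ in range(10)])
```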
You could use any distribution which has positive support AND can be specified by mean and variance. For example,
- one-parameter distributions won't work in general. For example chi-square won't work unless your variance is always double its mean. Similarly exponential won't work unless your variance equals your mean squared.
- some two-parameter distributions won't work in some cases. Binomial distribution won't work unless variance is less than your mean. Similarly the non-central chi-square won't work unless your variance is greater than 2 times your mean and less than 4 times your mean!
- However log-normal and gamma will work in all cases.
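For the gamma case, a sketch of the moment matching: a gamma distribution with shape k and scale theta has mean k * theta and variance k * theta^2, so k = m^2 / s^2 and theta = s^2 / m. The helper names are assumptions for illustration; `random.gammavariate(alpha, beta)` is the standard-library call, with alpha as the shape and beta as the scale.

```python
import random

def gamma_params(m, s):
    """Return (shape, scale) of a gamma distribution with mean m and std dev s."""
    shape = (m / s) ** 2
    scale = s * s / m
    return shape, scale

def sample_gamma(m, s, n, rng=random):
    """Draw n positive numbers from the gamma with mean m and std dev s."""
    shape, scale = gamma_params(m, s)
    return [rng.gammavariate(shape, scale) for _ in range(n)]

if __name__ == "__main__":
    print(sample_gamma(10.0, 3.0, 5))
```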
If I understand you correctly, you want to generate random numbers from a distribution with positive support. There are many possible choices. The simplest is the
chi-square: http://en.wikipedia.org/wiki/Chi-square_distribution (with k degrees of freedom, it is just the sum of k squared standard Gaussians)
All the asymmetric distributions supported on the positive reals (exponential, Weibull, Pareto, inverse Gaussian, log-normal, gamma)
The distributions from the skew family (skew-normal, skew-Student, ...) are another asymmetric option, but note that they are supported on the whole real line and can return negative values.
Apart from the skew family, any random number drawn from any of the distributions above will always be positive.