Most functions for generating lognormally distributed random numbers take the mean and standard deviation of the associated normal distribution as parameters.
My problem is that I only know the mean and the coefficient of variation of the lognormal distribution. It is reasonably straight forward to derive the parameters I need for the standard functions from what I have:
If mu
and sigma
are the mean and standard deviation of the associated normal distribution, we know that
coeffOfVar^2 = variance / mean^2
= (exp(sigma^2) - 1) * exp(2*mu + sigma^2) / exp(mu + sigma^2/开发者_如何转开发2)^2
= exp(sigma^2) - 1
We can rearrange this to
sigma = sqrt(log(coeffOfVar^2 + 1))
We also know that
mean = exp(mu + sigma^2/2)
This rearranges to
mu = log(mean) - sigma^2/2
Here's my R implementation
rlnorm0 <- function(mean, coeffOfVar, n = 1e6)
{
sigma <- sqrt(log(coeffOfVar^2 + 1))
mu <- log(mean) - sigma^2 / 2
rlnorm(n, mu, sigma)
}
It works okay for small coefficients of variation
r1 <- rlnorm0(2, 0.5)
mean(r1) # 2.000095
sd(r1) / mean(r1) # 0.4998437
But not for larger values
r2 <- rlnorm0(2, 50)
mean(r2) # 2.048509
sd(r2) / mean(r2) # 68.55871
To check that it wasn't an R-specific issue, I reimplemented it in MATLAB. (Uses stats toolbox.)
function y = lognrnd0(mean, coeffOfVar, sizeOut)
if nargin < 3 || isempty(sizeOut)
sizeOut = [1e6 1];
end
sigma = sqrt(log(coeffOfVar.^2 + 1));
mu = log(mean) - sigma.^2 ./ 2;
y = lognrnd(mu, sigma, sizeOut);
end
r1 = lognrnd0(2, 0.5);
mean(r1) % 2.0013
std(r1) ./ mean(r1) % 0.5008
r2 = lognrnd0(2, 50);
mean(r2) % 1.9611
std(r2) ./ mean(r2) % 22.61
Same problem. The question is, why is this happening? Is it just that the standard deviation is not robust when the variation is that wide? Or have I screwed up somewhere?
The results are not surprising. For distributions with large kurtosis, expected variance of the sample variance is roughly mu4/N, where mu4 is the 4th moment of the distribution. For a lognormal, mu4 exponentially depends on the parameter sigma^2, meaning that for large enough values of sigma, your sample variance will be all over the place relative to the true variance. This is precisely what you have observed. In your example, mu4/N ~ (coeffOfVar^8)/N ~ 50^8/1e6 ~ 4e7.
For derivation of the expected variance of sample var. see http://mathworld.wolfram.com/SampleVarianceDistribution.html. Below is some code to illustrate the ideas in a more exact manner. Note the large value of both the variance of the sample variance and its theoretical expected value, even for coeffOfVar = 5.
exp.var.of.samp.var <- function(n,mu2,mu4){
(n-1)*((n-1)*mu4-(n-3)*mu2^2)/n^3
}
mu2.lnorm <- function(mu,sigma){
(exp(sigma^2)-1)*exp(2*mu+sigma^2)
}
mu4.lnorm <- function(mu,sigma){
mu2.lnorm(mu,sigma)^2*(exp(4*sigma^2)+2*exp(3*sigma^2)+3*exp(2*sigma^2)-3)
}
exp.var.lnorm.var <- function(n,mu,sigma){
exp.var.of.samp.var(n,mu2.lnorm(mu,sigma),mu4.lnorm(mu,sigma))
}
exp.var.norm.var <- function(n,mu,sigma){
exp.var.of.samp.var(n,sigma^2,3*sigma^4)
}
coeffOfVar <- 5
mean <- 2
sigma <- sqrt(log(coeffOfVar^2 + 1)) # gives sigma=1.805020
mu <- log(mean) - sigma^2 / 2 # mu=-0.935901
n <- 1e4
m <- 1e4
## Get variance of sample variance for lognormal distribution:
var.trial <- replicate(m,var(rlnorm(n, mu, sigma)))
cat("samp. variance (mean of",m,"trials):",mean(var.trial),"\n")
cat("theor. variance:",mu2.lnorm(mu,sigma),"\n")
cat("variance of the sample var:",var(var.trial),"\n")
cat("expected variance of the sample var:",exp.var.lnorm.var(n,mu,sigma),"\n")
> samp. variance (mean of 10000 trials): 105.7192
> theor. variance: 100
> variance of the sample var: 350997.7
> expected variance of the sample var: 494053.2
## Do this with normal distribution:
var.trial <- replicate(m,var(rnorm(n, mu, sigma)))
cat("samp. variance (mean of",m,"trials):",mean(var.trial),"\n")
cat("theor. variance:",sigma^2,"\n")
cat("variance of the sample var:",var(var.trial),"\n")
cat("expected variance of the sample var:",exp.var.norm.var(n,mu,sigma),"\n")
> samp. variance (mean of 10000 trials): 3.257944
> theor. variance: 3.258097
> variance of the sample var: 0.002166131
> expected variance of the sample var: 0.002122826
精彩评论