Generation of a lognormally distributed random number from the average value, coefficient of variation

Question

Generation of a lognormally distributed random number from the average value, coefficient of variation

Most functions for generating logarithmically distributed random numbers accept the mean and standard deviation of the associated normal distribution as parameters.

My problem is that I only know the average value and the coefficient of variation of the lognormal distribution. It is enough to simply get the parameters that I need for standard functions from what I have:

If mu and sigma are the mean and standard deviation of the associated normal distribution, we know that

 coeffOfVar^2 = variance / mean^2 = (exp(sigma^2) - 1) * exp(2*mu + sigma^2) / exp(mu + sigma^2/2)^2 = exp(sigma^2) - 1

We can change this to

 sigma = sqrt(log(coeffOfVar^2 + 1))

We also know that

 mean = exp(mu + sigma^2/2)

It rebuilds onto

 mu = log(mean) - sigma^2/2

Here is my implementation of R

 rlnorm0 <- function(mean, coeffOfVar, n = 1e6) { sigma <- sqrt(log(coeffOfVar^2 + 1)) mu <- log(mean) - sigma^2 / 2 rlnorm(n, mu, sigma) }

Works well for small coefficient of variation

 r1 <- rlnorm0(2, 0.5) mean(r1) # 2.000095 sd(r1) / mean(r1) # 0.4998437

But not for large values

 r2 <- rlnorm0(2, 50) mean(r2) # 2.048509 sd(r2) / mean(r2) # 68.55871

To verify that this is not a R-specific issue, I reimplemented it in MATLAB. (Uses the statistics toolbar.)

 function y = lognrnd0(mean, coeffOfVar, sizeOut) if nargin < 3 || isempty(sizeOut) sizeOut = [1e6 1]; end sigma = sqrt(log(coeffOfVar.^2 + 1)); mu = log(mean) - sigma.^2 ./ 2; y = lognrnd(mu, sigma, sizeOut); end r1 = lognrnd0(2, 0.5); mean(r1) % 2.0013 std(r1) ./ mean(r1) % 0.5008 r2 = lognrnd0(2, 50); mean(r2) % 1.9611 std(r2) ./ mean(r2) % 22.61

Same problem. The question is, why is this happening? Is it just that the standard deviation is not reliable when the variation is so wide? Or am I screwing something?

+4

r statistics matlab

Richie cotton Mar 26 '10 at 10:13

source share

1 answer

Leo Alekseyev · Accepted Answer · 2010-03-26T12:34:07+0000

The results are not surprising. For distributions with a large excess, the expected variance of the sample variance is approximately equal to mu4 / N, where mu4 is the 4th propagation moment. For lognormal mu4, it exponentially depends on the sigma ^ 2 parameter, which means that for sufficiently large sigma values your sample variance will be all over the place relative to the true variance. This is exactly what you observed. In your example, mu4 / N ~ (coeffOfVar ^ 8) / N ~ 50 ^ 8 / 1e6 ~ 4e7.

To output the expected variance of the sample var. see http://mathworld.wolfram.com/SampleVarianceDistribution.html . Below is some code to more accurately illustrate ideas. Note the great importance of both the variance of the variance of the sample and its theoretical expected value, even for coeffOfVar = 5.

 exp.var.of.samp.var <- function(n,mu2,mu4){ (n-1)*((n-1)*mu4-(n-3)*mu2^2)/n^3 } mu2.lnorm <- function(mu,sigma){ (exp(sigma^2)-1)*exp(2*mu+sigma^2) } mu4.lnorm <- function(mu,sigma){ mu2.lnorm(mu,sigma)^2*(exp(4*sigma^2)+2*exp(3*sigma^2)+3*exp(2*sigma^2)-3) } exp.var.lnorm.var <- function(n,mu,sigma){ exp.var.of.samp.var(n,mu2.lnorm(mu,sigma),mu4.lnorm(mu,sigma)) } exp.var.norm.var <- function(n,mu,sigma){ exp.var.of.samp.var(n,sigma^2,3*sigma^4) } coeffOfVar <- 5 mean <- 2 sigma <- sqrt(log(coeffOfVar^2 + 1)) # gives sigma=1.805020 mu <- log(mean) - sigma^2 / 2 # mu=-0.935901 n <- 1e4 m <- 1e4 ## Get variance of sample variance for lognormal distribution: var.trial <- replicate(m,var(rlnorm(n, mu, sigma))) cat("samp. variance (mean of",m,"trials):",mean(var.trial),"\n") cat("theor. variance:",mu2.lnorm(mu,sigma),"\n") cat("variance of the sample var:",var(var.trial),"\n") cat("expected variance of the sample var:",exp.var.lnorm.var(n,mu,sigma),"\n") > samp. variance (mean of 10000 trials): 105.7192 > theor. variance: 100 > variance of the sample var: 350997.7 > expected variance of the sample var: 494053.2 ## Do this with normal distribution: var.trial <- replicate(m,var(rnorm(n, mu, sigma))) cat("samp. variance (mean of",m,"trials):",mean(var.trial),"\n") cat("theor. variance:",sigma^2,"\n") cat("variance of the sample var:",var(var.trial),"\n") cat("expected variance of the sample var:",exp.var.norm.var(n,mu,sigma),"\n") > samp. variance (mean of 10000 trials): 3.257944 > theor. variance: 3.258097 > variance of the sample var: 0.002166131 > expected variance of the sample var: 0.002122826

Generation of a lognormally distributed random number from the average value, coefficient of variation

More articles: