I have written a function that computes the Kullback-Leibler divergence from N(mu2, sigma2) to N(0, 1):
mu1 <- 0
sigma1 <- 1
f <- function(mu2, sigma2)
{
  # integrand: (log p(x) - log q(x)) * p(x),
  # where p = N(mu1, sigma1) and q = N(mu2, sigma2)
  g <- function(x)
  {
    (dnorm(x, mean = mu1, sd = sigma1, log = TRUE) -
       dnorm(x, mean = mu2, sd = sigma2, log = TRUE)) *
      dnorm(x, mean = mu1, sd = sigma1)
  }
  return(integrate(g, -Inf, Inf)$value)
}
For example, the KL divergence from N(5, 1) to N(0, 1) is
> f(5, 1)
[1] 12.5
I am confident that this result is correct because I derived by hand a closed-form expression for the KL divergence from N(mu2, sigma2) to N(mu1, sigma1).
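For reference, that closed form is a standard result for two univariate Gaussians; a sketch of it in R (the name kl_closed_form is my own):

kl_closed_form <- function(mu1, sigma1, mu2, sigma2)
{
  # KL( N(mu1, sigma1^2) || N(mu2, sigma2^2) ), i.e. the divergence
  # from N(mu2, sigma2) to N(mu1, sigma1) in the notation above
  log(sigma2 / sigma1) + (sigma1^2 + (mu1 - mu2)^2) / (2 * sigma2^2) - 1/2
}
kl_closed_form(0, 1, 5, 1)  # 12.5, matching f(5, 1)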
My question is about the KLdiv function from the flexmix package. Why doesn't it yield the same result? What does it actually compute?
> library(flexmix)
> x <- seq(-4, 12, length=200)
> y <- cbind(norm1=dnorm(x, mean=0, sd=1), norm2=dnorm(x, mean=5, sd=1))
> KLdiv(cbind(y))
norm1 norm2
norm1 0.000000 7.438505
norm2 7.438375 0.000000
Instead of using KLdiv, what do you think of the following procedure?
> x <- rnorm(1000)
> dist <- mean(dnorm(x, mean=0, sd=1, log=TRUE)) -
+ mean(dnorm(x, mean=5, sd=1, log=TRUE))
> print(dist)
[1] 12.40528
Thank you in advance!
In the last part you write:
x <- rnorm(1000)
dist <- mean(dnorm(x, mean=0, sd=1, log=TRUE)) -
mean(dnorm(x, mean=5, sd=1, log=TRUE))
print(dist)
[1] 12.40528
This is a Monte Carlo estimate of the divergence based on a random sample of size 1000. The closed-form expression is the limiting value as the sample size goes to infinity. If you increase the sample size, the estimate will get closer to 12.5; or, if you repeat the calculation many times, you will see that the mean of the estimates is 12.5, as you want.
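For instance, a minimal sketch of the repeated calculation (both 1000s are arbitrary choices):

set.seed(42)  # arbitrary seed, for reproducibility
estimates <- replicate(1000, {
  x <- rnorm(1000)
  mean(dnorm(x, mean = 0, sd = 1, log = TRUE)) -
    mean(dnorm(x, mean = 5, sd = 1, log = TRUE))
})
mean(estimates)  # close to 12.5
sd(estimates)    # spread of the individual size-1000 estimates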
As for the KLdiv result: check the eps parameter on the manual page (in R: ?"KLdiv,matrix-method"). Roughly speaking, density values below eps are clamped to eps, so with the default threshold the tails are truncated and the divergence is underestimated. Lowering eps recovers the analytical result:
> KLdiv(cbind(y),eps=1e-16)
norm1 norm2
norm1 0.00000 12.49908
norm2 12.49941 0.00000
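As a sanity check on this explanation, here is a rough, hand-rolled grid approximation with the same kind of clamping; this is my own sketch of the mechanism, not flexmix's actual code:

x <- seq(-4, 12, length = 200)               # same grid as above
eps <- 1e-4                                  # a default-sized threshold (assumed)
p <- pmax(dnorm(x, mean = 0, sd = 1), eps)   # clamp densities below eps
q <- pmax(dnorm(x, mean = 5, sd = 1), eps)
sum(p * (log(p) - log(q))) * diff(x)[1]      # about 7.4, close to KLdiv's default output

Shrinking eps makes the clamping irrelevant on this grid, which is why KLdiv with eps=1e-16 gets back to roughly 12.5.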