I was trying to program the algorithm for the cdf for the multivariate t-distribution following Genz and Bretz, T开发者_如何学Che reference package in R is mvtnorm.
When I was testing my function, I found that my numbers don't match up. In the following example, adjusted from the mvtnorm help, the multivariate t random variable has independent components. So the integral should be just the product of 3 independent probabilities
> lower <- -1
> upper <- 3
> df <- 4
> corr <- diag(3)
> delta <- rep(0, 3)
> pmvt(lower=lower, upper=upper, delta=delta, df=df, corr=corr)
[1] 0.5300413
attr(,"error")
[1] 4.321136e-05
attr(,"msg")
[1] "Normal Completion"
The reported error is 4e-5, the error compared to the product of independent probabilities
> (pt(upper, df) - pt(lower, df))**3
[1] 0.4988254
is
0.5300413 - 0.4988254 = 0.0312159
I'm getting discrepancies in my own code compared to R mvtnorm for various examples in about the same range.
I'm mostly a beginner in R. So, what am I doing wrong or what is wrong?
(I'm not signed up on a R-help mailing list, so I try here.)
UPDATE: As pchalasani explained, my statistics was wrong, the bug in my own code was in some helper function not in the t distribution code. A good way of seeing that being uncorrelated does not imply independence, is looking at the conditional distribution. Here are the column frequencies %*100 for independent a bivariate random variable (10000 samples) for quartiles (distribution conditional on column variable).
bivariate uncorrelated normal variates
([[26, 25, 24, 23],
[24, 23, 24, 25],
[24, 27, 24, 24],
[24, 23, 26, 25]])
bivariate uncorrelated t variates
([[29, 20, 22, 29],
[20, 31, 28, 21],
[20, 29, 29, 20],
[29, 18, 18, 29]])
The distribution in the first and last column is very different from the middle columns. (Sorry, no R code, since I don't know how to do this quickly with R.)
Zero Correlation does not imply independence, for jointly non-Gaussian distributed random variables!
Let me elaborate: there is no bug here. The flaw lies in your assumption that when the multivariate Student-t random variables are uncorrelated, they are also independent, which is definitely not the case: the only class of multiavariate distributions where no correlation implies independence, is the MV Gaussian distribution.
To see that two uncorrelated random variables that jointly follow a MV Student-T distribution are not independent, consider the case of n=2
:
require(mvtnorm)
x <- rmvt(100000, sigma = diag(2), df=4, delta = rep(0,2) )
Now each column of x
represents realizations of the two random variables. We first check that their correlation is fairly small:
> cor(x[,1], x[,2])
[1] -0.003378811
However the correlation of the squares of x[,1]
and x[,2]
is as high as 30.4%, i.e., definitely not zero, proving that x[,1]
and x[,2]
are not statistically independent:
> cor(x[,1]^2, x[,2]^2)
[1] 0.3042684
精彩评论