I am currently using Python with RPy to access the functionality inside R.
How can I use an R library to generate Monte Carlo samples that honor the correlation between two variables? For example, if variables A and B have a correlation of 85% (0.85), I need all of the Monte Carlo samples to honor that correlation between A and B.
I would appreciate it if anyone could share ideas or snippets.
Thanks
The rank correlation method of Iman and Conover seems to be a widely used and general approach to producing correlated Monte Carlo samples for computer-based experiments, sensitivity analysis, etc. Unfortunately, I have only just come across this and don't have access to the PDF, so I don't know how the authors actually implement their method, but you could follow this up.
Their method is more general because each variable can come from a different distribution unlike the multivariate normal of @Dirk's answer.
Update: I found an R implementation of the above approach in package mc2d; in particular, you want the cornode() function.
Here is an example taken from ?cornode:
> require(mc2d)
> x1 <- rnorm(1000)
> x2 <- rnorm(1000)
> x3 <- rnorm(1000)
> mat <- cbind(x1, x2, x3)
> ## Target
> (corr <- matrix(c(1, 0.5, 0.2, 0.5, 1, 0.2, 0.2, 0.2, 1), ncol=3))
     [,1] [,2] [,3]
[1,]  1.0  0.5  0.2
[2,]  0.5  1.0  0.2
[3,]  0.2  0.2  1.0
> ## Before
> cor(mat, method="spearman")
            x1         x2          x3
x1  1.00000000 0.01218894 -0.02203357
x2  0.01218894 1.00000000  0.02298695
x3 -0.02203357 0.02298695  1.00000000
> matc <- cornode(mat, target=corr, result=TRUE)
Spearman Rank Correlation Post Function
          x1        x2        x3
x1 1.0000000 0.4515535 0.1739153
x2 0.4515535 1.0000000 0.1646381
x3 0.1739153 0.1646381 1.0000000
The rank correlations in matc are now very close to the target correlations of corr.
The idea is that you draw the samples separately from the distribution for each variable, and then use the Iman & Conover approach to bring the samples as close to the target correlations as possible.
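Since the question mentions Python, here is a rough sketch of that "draw separately, then reorder" idea for two variables, using only the standard library. To be clear about what is assumed: this is a simplified rank reordering driven by correlated normal draws, not the published Iman & Conover algorithm and not what cornode() does internally. The function names (ranks, reorder_to_correlation, spearman), the exponential and uniform marginals, and the seed are all made up for illustration; 0.85 is the figure from the question.

```python
import math
import random

def ranks(values):
    """Return the 0-based rank of each element (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for rank, idx in enumerate(order):
        r[idx] = rank
    return r

def reorder_to_correlation(a, b, rho, rng):
    """Reorder samples a and b so their ranks follow a correlated Gaussian driver."""
    n = len(a)
    # Correlated standard normals from the Cholesky factor of [[1, rho], [rho, 1]].
    z1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
    z2 = [rho * z + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0) for z in z1]
    a_sorted, b_sorted = sorted(a), sorted(b)
    # Put the k-th smallest marginal value where the driver has its k-th smallest value.
    a_new = [a_sorted[r] for r in ranks(z1)]
    b_new = [b_sorted[r] for r in ranks(z2)]
    return a_new, b_new

def spearman(x, y):
    """Spearman rank correlation for samples without ties."""
    n = len(x)
    d2 = sum((p - q) ** 2 for p, q in zip(ranks(x), ranks(y)))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))

rng = random.Random(42)
a = [rng.expovariate(1.0) for _ in range(5000)]    # exponential marginal
b = [rng.uniform(0.0, 10.0) for _ in range(5000)]  # uniform marginal
a_corr, b_corr = reorder_to_correlation(a, b, 0.85, rng)
print("before:", round(spearman(a, b), 3))
print("after: ", round(spearman(a_corr, b_corr), 3))
```

The reordering only permutes each sample, so the marginal distributions are untouched. Note that 0.85 is applied here to the Pearson correlation of the normal driver, so the rank correlation of the output should come out slightly below 0.85.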
That is a FAQ. Here is one answer using a recommended package:
R> library(MASS)
R> example(mvrnorm)
mvrnrmR> Sigma <- matrix(c(10,3,3,2),2,2)
mvrnrmR> Sigma
     [,1] [,2]
[1,]   10    3
[2,]    3    2
mvrnrmR> var(mvrnorm(n=1000, rep(0, 2), Sigma))
        [,1]    [,2]
[1,] 8.82287 2.63987
[2,] 2.63987 1.93637
mvrnrmR> var(mvrnorm(n=1000, rep(0, 2), Sigma, empirical = TRUE))
     [,1] [,2]
[1,]   10    3
[2,]    3    2
R>
Switching between correlation and covariance is straightforward (hint: outer product of vector of standard deviations).
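To spell out the hint for the Python side of the question: the covariance matrix is the correlation matrix multiplied elementwise by the outer product of the standard deviations, and the reverse conversion divides by it. Below is a minimal plain-Python sketch. The helper names cov_from_corr and corr_from_cov are made up, and the standard deviations 2 and 5 are assumed values chosen only for illustration; 0.85 is the correlation from the question.

```python
import math

def cov_from_corr(corr, sds):
    """Sigma[i][j] = corr[i][j] * sds[i] * sds[j]."""
    n = len(sds)
    return [[corr[i][j] * sds[i] * sds[j] for j in range(n)] for i in range(n)]

def corr_from_cov(cov):
    """Divide each covariance by the product of the two standard deviations."""
    n = len(cov)
    sds = [math.sqrt(cov[i][i]) for i in range(n)]
    return [[cov[i][j] / (sds[i] * sds[j]) for j in range(n)] for i in range(n)]

# Correlation from the question, with assumed standard deviations of 2 and 5.
corr = [[1.0, 0.85], [0.85, 1.0]]
sigma = cov_from_corr(corr, [2.0, 5.0])
print(sigma)

# Going the other way with the Sigma used in the mvrnorm example.
print(corr_from_cov([[10.0, 3.0], [3.0, 2.0]]))
```

The resulting sigma is what you would pass as the Sigma argument to mvrnorm.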
This question was not tagged as python, but based on your comment it looks like you might be looking for a Python solution as well. The most basic implementation of Iman-Conover that I can concoct looks like the following in Python (actually numpy):
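import numpy as np

def makeCorrelated(y, corMatrix):
    # y has shape (n_variables, n_samples): one row of samples per marginal
    c = np.random.multivariate_normal(np.zeros(y.shape[0]), corMatrix, y.shape[1])
    # rank of each Gaussian draw within its variable, transposed to match y
    key = np.argsort(np.argsort(c, axis=0), axis=0).T
    # reorder each sorted marginal so that its ranks follow the Gaussian draws
    out = np.take_along_axis(np.sort(y, axis=1), key, axis=1)
    return out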
where y is an array of samples from the marginal distributions (one row per variable) and corMatrix is a positive semi-definite, symmetric correlation matrix. Because this function uses multivariate_normal() for the c matrix, it implies a Gaussian copula. To use different copula structures you'll need to use different drivers for the c matrix.
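As a rough illustration of what a different driver could look like, here is a standard-library sketch that produces Student-t pairs instead of Gaussian ones. The Student-t choice, the function names (gaussian_driver, student_t_driver) and the parameters (df=4, the seed) are my own assumptions for the example, not something the approach prescribes. Pairs like these would take the place of the multivariate_normal() draw as the c matrix, with the ranking and reordering step left unchanged.

```python
import math
import random

def gaussian_driver(n, rho, rng):
    """n pairs of standard normals with Pearson correlation rho."""
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        pairs.append((z1, z2))
    return pairs

def student_t_driver(n, rho, df, rng):
    """Gaussian pairs rescaled by a shared sqrt(df / chi-square) factor per pair."""
    pairs = []
    for z1, z2 in gaussian_driver(n, rho, rng):
        w = rng.gammavariate(df / 2.0, 2.0)  # chi-square draw with df degrees of freedom
        scale = math.sqrt(df / w)
        pairs.append((z1 * scale, z2 * scale))
    return pairs

pairs_t = student_t_driver(5000, 0.85, 4, random.Random(1))
# Fraction of pairs whose two components share a sign, as a quick dependence check.
same_sign = sum(1 for u, v in pairs_t if (u > 0) == (v > 0)) / len(pairs_t)
print(round(same_sign, 3))
```

Because both components of a pair share the same scale factor, extreme values tend to occur together more often than under the Gaussian driver, which is the usual reason for reaching for a t copula.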