correlation failure - Pearson

Developer https://www.devze.com 2023-03-31 02:18 Source: Internet
I want to write correlation information to a data file as follows:

korelacja=cor(p2,d2,method="pearson",use = "complete.obs")
korelacja2=cor(p2,d2,method="kendall",use = "complete.obs")
korelacja3=cor(p2,d2,method="spearman",use = "complete.obs")
dane=paste(korelacja,korelacja2,korelacja3,sep=';')
write(dane,file=nazwa,append=TRUE)

The results look strange to me: the Pearson correlation is very high (always equal to one), but the Kendall and Spearman correlations are very low. I created scatterplots and I don't see any linear correlation.


It's not hard to replicate this pattern if you have some large outliers in your data that dominate the Pearson correlation but are relatively insignificant in the non-parametric (Kendall/Spearman) approaches. For example, here's a concocted data set with nothing going on except for one large outlier:

> set.seed(1001)
> x <- c(runif(1000),1e5)
> y <- c(runif(1000),1e5)
> cor(x,y,method="pearson")
[1] 1
> cor(x,y,method="kendall")
[1] -0.02216583
> cor(x,y,method="spearman")
[1] -0.03335352

This is consistent with your description so far, although you ought in this case to be able to see the outliers in your scatterplots ...
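
If the outliers are hard to spot by eye, one way to check is to flag extreme points and recompute the Pearson correlation without them. The following is only a minimal sketch on the concocted data above, not a general recipe: the rule "more than 10 median absolute deviations from the median" and the names is_far, is_outlier, pearson_all and pearson_trimmed are assumptions made for illustration, and a sensible threshold for your own p2 and d2 depends on the data.

```r
# Same concocted data as above: pure noise plus one huge outlier.
set.seed(1001)
x <- c(runif(1000), 1e5)
y <- c(runif(1000), 1e5)

# Hypothetical outlier rule (an assumption, not a standard cutoff):
# flag values more than 10 median absolute deviations from the median.
is_far <- function(v) abs(v - median(v)) > 10 * mad(v)
is_outlier <- is_far(x) | is_far(y)

# Pearson correlation with and without the flagged points.
pearson_all <- cor(x, y, method = "pearson")
pearson_trimmed <- cor(x[!is_outlier], y[!is_outlier], method = "pearson")

cat("flagged points:   ", sum(is_outlier), "\n")
cat("Pearson (all):    ", pearson_all, "\n")
cat("Pearson (trimmed):", pearson_trimmed, "\n")
```

If the Pearson value collapses toward the Kendall and Spearman values once the flagged points are dropped, a few extreme observations were driving it. Whether those observations should actually be removed from the analysis is a separate question that depends on where they come from.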
