R: calculate variance for data$V1 for each different value in data$V2

Developer https://www.devze.com 2023-03-31 02:34 Source: Internet
I have a data frame that looks like this:

V1   V2
..   1
..   2
..   1
..   3

etc.

For each distinct value of V2, I would like to calculate the variance of the data in V1. I have just started my adventure with R; any hints on how to do this? For my specific case, I guess I can manually do something like

 var1 = var(data[data$V2==1, "V1"])
 var2 = ...

etc., because I know all the possible V2 values (there are not many). However, I am curious what a more generic solution would be. Any ideas?


And the old standby, tapply:

dat <- data.frame(x = runif(50), y = rep(letters[1:5],each = 10))
tapply(dat$x,dat$y,FUN = var)

         a          b          c          d          e 
0.03907351 0.10197081 0.08036828 0.03075195 0.08289562 
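
As a rough sketch of what tapply is doing here, the same result can be built by splitting x into one vector per value of y and applying var to each piece. The fixed seed and the names by_tapply, by_split and manual_a are assumptions added only for this illustration.

```r
set.seed(1)  # fixed seed, only so the example is reproducible
dat <- data.frame(x = runif(50), y = rep(letters[1:5], each = 10))

# tapply: group dat$x by the values of dat$y and apply var() to each group
by_tapply <- tapply(dat$x, dat$y, FUN = var)

# The same computation spelled out: split into a list of vectors, then apply
by_split <- sapply(split(dat$x, dat$y), var)

# The manual approach from the question, for the single group "a"
manual_a <- var(dat[dat$y == "a", "x"])

print(by_tapply)
print(all.equal(as.vector(by_tapply), as.vector(by_split)))
print(all.equal(as.numeric(by_tapply["a"]), manual_a))
```

Note that var returns NA for a group that contains only one observation, so very small groups show up as NA in the result.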


Another solution, using data.table. It is a lot faster, which is especially useful when you have large data sets.

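library(data.table)
# dat is assumed to be a data frame with columns V1 and V2, as in the question
dat2 <- data.table(dat)
# compute var(V1) within each group of V2
ans <- dat2[, list(variance = var(V1)), by = "V2"]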


There are a few ways to do this; I prefer aggregate:

dat <- data.frame(V1 = rnorm(50), V2=rep(1:5,10))
dat

# The formula V1 ~ V2 tells it to group V1 by the values in V2;
# the last argument is simply the function to apply to each group.
aggregate(V1 ~ V2, data = dat, FUN = var)

> aggregate (V1~V2, data=dat, var)
  V2        V1
1  1 0.9139360
2  2 1.6222236
3  3 1.2429743
4  4 1.1889356
5  5 0.7000294
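
For comparison, here is a small sketch of the same call written with aggregate's non-formula interface, where the grouping variable is passed through by. The fixed seed and the names res_formula and res_by are assumptions made for this example; the values differ from the output above because the data are random.

```r
set.seed(42)  # fixed seed, only so the example is reproducible
dat <- data.frame(V1 = rnorm(50), V2 = rep(1:5, 10))

# Formula interface: "V1 grouped by V2"
res_formula <- aggregate(V1 ~ V2, data = dat, FUN = var)

# Non-formula interface: values first, grouping variables as a named list
res_by <- aggregate(dat$V1, by = list(V2 = dat$V2), FUN = var)

print(res_formula)
print(res_by)  # same numbers, but the value column is named "x"

# Either result is a plain data frame, so one group can be looked up directly
var_group3 <- res_formula$V1[res_formula$V2 == 3]
print(var_group3)
```

The formula version keeps the original column name V1 in the result, which is usually the more convenient of the two.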

Also look into ddply, daply, etc. in the plyr package.


library(plyr)  # ddply() and .() are provided by plyr
ddply(data, .(V2), summarise, variance=var(V1))


Using dplyr you can do:

library(dplyr)
data %>%
  group_by(V2) %>%
  summarize(var = var(V1))

Here we group by the unique values of V2 and find the variance of V1 for each group.
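
A self-contained version of the same idea might look like the sketch below, assuming dplyr is installed. The simulated data, the seed, and the column names variance and n are assumptions for illustration; the group size is added only to show that several summaries can be computed in one summarize call.

```r
library(dplyr)

set.seed(7)  # fixed seed, only so the example is reproducible
data <- data.frame(V1 = rnorm(50), V2 = rep(1:5, 10))

res <- data %>%
  group_by(V2) %>%                 # one group per distinct value of V2
  summarize(variance = var(V1),    # variance of V1 within each group
            n = n())               # number of rows in each group

print(res)
```

Naming the result column variance rather than var keeps it visually distinct from the var function.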
