I have much the same problem. My data are laid out like this (; separates the columns):
D1 ;hurs
1 ;0.12
1 ;0.23
1 ;0.34
1 ;0.01
2 ;0.24
2 ;0.67
2 ;0.78
2 ;0.98
and I would like to have it like this:
D1; X; X; X; X
1;0.12; 0.23; 0.34; 0.01;
2;0.24; 0.67; 0.78; 0.98;
I would like to sort the data by D1 and reshape it so that each value of D1 gets one row. Does anyone have an idea? I need to do this for 7603 values of D1.
I would look into Hadley's reshape package. It does all sorts of great stuff. The code below works with your toy example, though there is probably a more elegant way of doing this. Your data already appear to be in the melted form (see ?melt), so you can simply cast it (see ?cast).
Also, check out these links:
http://www.statmethods.net/management/reshape.html
http://had.co.nz/reshape/
library(reshape)
help(package=reshape)
?melt
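# Toy data from the question
D1 <- c(1,1,1,1,2,2,2,2)
hurs <- c(.12, .23, .34, .01, .24, .67, .78, .98)
# Column labels X1..X4, repeated once per D1 value
var <- rep(paste("X", 1:4, sep=""), 2)
foo <- data.frame(D1, var, hurs)
foo
# One row per D1, one column per label; with no "value" column present,
# cast() takes the last column (hurs) as the values
cast(foo, D1~var)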
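The var column above is typed out for two groups of four, which will not scale to 7603 values of D1. A minimal base R sketch of one way to generate it, assuming the rows are already in the order you want within each D1 (the paste0/ave approach is my own suggestion, not something the reshape package requires):

```r
# Toy data from the question
D1   <- c(1, 1, 1, 1, 2, 2, 2, 2)
hurs <- c(.12, .23, .34, .01, .24, .67, .78, .98)

# Number the rows within each D1 group (1, 2, 3, ... per id);
# ave() applies seq_along() separately to each group
idx <- ave(seq_along(D1), D1, FUN = seq_along)

# Build the labels X1, X2, ... from the within-group position
var <- paste0("X", idx)

# Same column order as above, so hurs is still the last column
foo <- data.frame(D1, var, hurs)
foo
```

With foo built this way, the cast(foo, D1~var) call should apply unchanged, and the labels no longer depend on every D1 having the same number of rows.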
Digging up skeletons not likely to ever be claimed: why not use aggregate()?
dat = read.table(header = TRUE, sep = ";", text = "D1 ;hurs
1 ;0.12
1 ;0.23
1 ;0.34
1 ;0.01
2 ;0.24
2 ;0.67
2 ;0.78
2 ;0.98")
aggregate(hurs ~ D1, dat, c)
# D1 hurs.1 hurs.2 hurs.3 hurs.4
# 1 1 0.12 0.23 0.34 0.01
# 2 2 0.24 0.67 0.78 0.98
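One thing to be aware of: when every D1 has the same number of rows, the hurs part of that result is a single matrix column, not four separate columns, even though it prints as hurs.1 to hurs.4. A small sketch of flattening it into an ordinary data frame (the names agg and wide are just for illustration, and the data frame is built directly here so the block runs on its own):

```r
# Toy data from the question
dat <- data.frame(
  D1   = c(1, 1, 1, 1, 2, 2, 2, 2),
  hurs = c(.12, .23, .34, .01, .24, .67, .78, .98)
)

# Collect the hurs values of each D1 with c()
agg <- aggregate(hurs ~ D1, dat, c)
dim(agg)             # two columns only: D1 and the matrix column hurs
is.matrix(agg$hurs)

# Rebuilding the data frame from its columns expands the matrix
# into one regular column per position
wide <- do.call(data.frame, agg)
dim(wide)
```

After this step wide can be passed to something like write.table() in the usual way.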
If the number of rows per id in D1 is not the same, you can also use base R reshape() after first creating a "time" variable:
dat2 <- dat[-8, ]
dat2$timeSeq <- ave(dat2$D1, dat2$D1, FUN = seq_along)
reshape(dat2, direction="wide", idvar="D1", timevar="timeSeq")
# D1 hurs.1 hurs.2 hurs.3 hurs.4
# 1 1 0.12 0.23 0.34 0.01
# 5 2 0.24 0.67 0.78 NA
I have assumed that there may be an unequal number of hurs values per D1 (7603 values).
txt = 'D1 ;hurs
1 ;0.12
1 ;0.23
1 ;0.34
1 ;0.01
2 ;0.24
2 ;0.67
2 ;0.78
2 ;0.98'
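dat <- read.table(textConnection(txt),header=TRUE,sep=";")
# Keep the original row order as a tie-breaker within each D1
dat$Lp <- 1:nrow(dat)
dat <- dat[order(dat$D1,dat$Lp),]
# One numeric vector of hurs values per D1
out <- split(dat$hurs,dat$D1)
# Join each id and its values into a single string ending in ";"
out <- sapply(names(out),function(x) paste(paste(c(x,out[[x]]),collapse=";"),";",sep="",collapse=""))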
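The result out is a named character vector with one ";"-separated string per D1, so it still has to be written out somewhere. A minimal sketch of that last step, using a stand-in for out with the values the toy data should produce and a temporary file as a hypothetical output path:

```r
# Stand-in for the `out` vector built above (assumed values from the toy data)
out <- c("1" = "1;0.12;0.23;0.34;0.01;",
         "2" = "2;0.24;0.67;0.78;0.98;")

# Hypothetical output path; a temporary file keeps the sketch free of side effects
path <- tempfile(fileext = ".csv")

# writeLines() writes each element of the vector as one line
writeLines(out, path)

# Read the file back to check what was written
readLines(path)
```

Because each row is written as a plain string, rows with different numbers of values are not padded; that suits this approach but means the file is not a rectangular table.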
reshape2 is actually better than reshape. reshape uses significantly more memory and time than reshape2 (at least in my specific case, with something like 9 million rows).
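For reference, a short sketch of what the reshape2 equivalent might look like on the toy data. It assumes the reshape2 package is installed (the block does nothing otherwise) and uses dcast(), which is reshape2's counterpart to cast() for data frames; the variable names are just for illustration:

```r
# Toy data in long form, with a label column X1..X4 per D1
D1   <- c(1, 1, 1, 1, 2, 2, 2, 2)
hurs <- c(.12, .23, .34, .01, .24, .67, .78, .98)
var  <- rep(paste0("X", 1:4), 2)
foo  <- data.frame(D1, var, hurs)

# Only run the reshape step if reshape2 is available
if (requireNamespace("reshape2", quietly = TRUE)) {
  # One row per D1, one column per label; value.var names the column
  # holding the cell values
  wide <- reshape2::dcast(foo, D1 ~ var, value.var = "hurs")
  print(wide)
}
```

Naming value.var explicitly avoids relying on dcast() guessing which column holds the values.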
You might check Hadley Wickham's reshape package and its cast() function:
http://had.co.nz/reshape/