I have a CSV with a long list of data that looks like this:
Date user_id value
4/1 1 5
4/1 2 3
4/1 3 10
4/2 1 1
4/2 3 7
and I want to move it into a data frame that has one column of user IDs and one column for each date. I'm assuming there is a way to do this with sapply or lapply, but I'm not sure how to handle the fact that a user ID doesn't always exist for every date.
Maybe something using reshape like the following, which assumes your data are stored in dat:
reshape(dat,v.names = "value",idvar = "user_id",
direction = "wide",timevar = "Date")
user_id value.4/1 value.4/2
1 1 5 1
2 2 3 NA
3 3 10 7
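To try the call above, here is a minimal sketch that builds dat from the sample rows in the question. Using read.table(text = ...) is an assumption made for illustration; the real file would presumably be loaded with read.csv or a similar function.

```r
# Sample rows from the question; read.table(text = ...) stands in for the real CSV import
dat <- read.table(text = "
Date user_id value
4/1 1 5
4/1 2 3
4/1 3 10
4/2 1 1
4/2 3 7
", header = TRUE, stringsAsFactors = FALSE)

# One row per user_id, one value.<Date> column per date;
# user/date combinations that are absent from dat become NA
wide <- reshape(dat, v.names = "value", idvar = "user_id",
                direction = "wide", timevar = "Date")
wide
```

User 2 has no row for 4/2, so that cell is filled with NA rather than causing an error.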
Or perhaps more simply, use dcast from reshape2:
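library(reshape2)
# value.var names the column that fills the cells; without it dcast guesses and prints a message
dcast(dat, user_id ~ Date, value.var = "value")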
user_id 4/1 4/2
1 1 5 1
2 2 3 NA
3 3 10 7
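Since the question mentions sapply and lapply, it may help to know that their base R relative tapply can build the same table without extra packages. The following is only a sketch: it assumes dat has the three columns shown in the question (rebuilt here so the block runs on its own) and at most one value per user and date.

```r
# Sample rows from the question, rebuilt so this block is self-contained
dat <- data.frame(Date    = c("4/1", "4/1", "4/1", "4/2", "4/2"),
                  user_id = c(1, 2, 3, 1, 3),
                  value   = c(5, 3, 10, 1, 7))

# Rows are users, columns are dates; with one value per cell, sum just returns that value.
# Cells with no matching rows are left as NA.
m <- with(dat, tapply(value, list(user_id, Date), sum))

# Turn the matrix into a data frame with an explicit user_id column
wide <- data.frame(user_id = as.integer(rownames(m)), m, check.names = FALSE)
wide
```

If a user could have several values on the same date, sum would silently add them up, so a different aggregation function may be more appropriate in that case.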
This is also something that tidyr::spread does quite conveniently:
library(tidyr)
library(dplyr)
df <- data.frame("Date" = rep(c("Nov", "Dec"), each = 3),
                 "user.id" = rep(1:3, 2),
                 "value" = rnorm(6))
df.2 <- df %>%
  spread(Date, value)
df.2
user.id Dec Nov
1 -1.9094765 -1.101037
2 0.2358694 -1.418151
3 -0.4297790 -1.426573
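In current tidyr releases, spread is documented as superseded by pivot_wider. A minimal sketch of the equivalent call follows; it uses the sample rows from the question instead of random values so the result is reproducible, and the name dat is carried over from the earlier answer.

```r
library(tidyr)

# Sample rows from the question
dat <- data.frame(Date    = c("4/1", "4/1", "4/1", "4/2", "4/2"),
                  user_id = c(1, 2, 3, 1, 3),
                  value   = c(5, 3, 10, 1, 7))

# names_from supplies the new column names, values_from the cell contents;
# user/date combinations that are absent from dat become NA
wide <- pivot_wider(dat, names_from = Date, values_from = value)
wide
```

The result is a tibble with one row per user_id and one column per date.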