I am just starting to get beyond the basics in R and have come to a point where I need some help. I want to restructure some data. Here is what a sample dataframe may look like:
ID Sex Res Contact
1 M MA ABR
1 M MA CON
1 M MA WWF
2 F FL WIT
2 F FL CON
3 X GA XYZ
I want the data to look like:
ID Sex Res ABR CON WWF WIT XYZ
1 M MA 1 1 1 0 0
2 F FL 0 1 0 1 0
3 X GA 0 0 0 0 1
What are my options? How would I do this in R?
In short, I want to take the values of the Contact column and use them as column names in the restructured data frame, while holding a variable set of columns constant (in the example above, I held ID, Sex, and Res constant).
Also, is it possible to control the values in the restructured data? I may want to keep the data binary, or I may want each value to be the count of times that contact value occurs for each ID.
The reshape package is what you want. Documentation here: http://had.co.nz/reshape/. Not to toot my own horn, but I've also written up some notes on reshape's use here: http://www.ling.upenn.edu/~joseff/rstudy/summer2010_reshape.html
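For your purpose, this code should work:
library(reshape)
# cast() aggregates a column named "value", so add a dummy one
data$value <- 1
# one row per ID/Sex/Res, one column per Contact value, cells hold counts
cast(data, ID + Sex + Res ~ Contact, fun.aggregate = length)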
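To address the binary-versus-count part of the question, here is a minimal sketch, assuming the reshape package is installed. It rebuilds the sample data and adds one hypothetical repeated row (ID 1, CON) that is not in the question, purely so that counts and indicators differ. The names counts and binary are illustrative.

```r
library(reshape)  # assumes the reshape package is installed

# Sample data from the question, plus one hypothetical repeated row
# (ID 1, CON) so that counts and 0/1 indicators give different results.
data <- data.frame(
  ID      = c(1, 1, 1, 1, 2, 2, 3),
  Sex     = c("M", "M", "M", "M", "F", "F", "X"),
  Res     = c("MA", "MA", "MA", "MA", "FL", "FL", "GA"),
  Contact = c("ABR", "CON", "CON", "WWF", "WIT", "CON", "XYZ"),
  stringsAsFactors = TRUE
)
data$value <- 1  # dummy measure column that cast() aggregates

# Counts: how many times each Contact value occurs per ID.
counts <- cast(data, ID + Sex + Res ~ Contact,
               fun.aggregate = length, fill = 0)

# Binary: 1 if the Contact value occurs at all for that ID, else 0.
binary <- cast(data, ID + Sex + Res ~ Contact,
               fun.aggregate = function(x) as.integer(length(x) > 0),
               fill = 0)
```

The only difference between the two calls is the aggregation function, so other summaries can be swapped in the same way. Passing fill = 0 explicitly makes ID/Contact combinations that never occur show up as 0.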
model.matrix works great (this was asked recently, and gappy had this good answer):
> model.matrix(~ factor(d$Contact) -1)
factor(d$Contact)ABR factor(d$Contact)CON factor(d$Contact)WIT factor(d$Contact)WWF factor(d$Contact)XYZ
1 1 0 0 0 0
2 0 1 0 0 0
3 0 0 0 1 0
4 0 0 1 0 0
5 0 1 0 0 0
6 0 0 0 0 1
attr(,"assign")
[1] 1 1 1 1 1
attr(,"contrasts")
attr(,"contrasts")$`factor(d$Contact)`
[1] "contr.treatment"
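Note that the matrix above has one row per input row, not one row per ID. A rough base-R sketch of how it might be collapsed into the layout the question asks for follows. The data frame d is reconstructed here from the question's sample, and the names ind, wide, binary and contact_cols are illustrative, not part of the original answer.

```r
# Sample data from the question (reconstructed; d is assumed to look like this).
d <- data.frame(
  ID      = c(1, 1, 1, 2, 2, 3),
  Sex     = c("M", "M", "M", "F", "F", "X"),
  Res     = c("MA", "MA", "MA", "FL", "FL", "GA"),
  Contact = c("ABR", "CON", "WWF", "WIT", "CON", "XYZ")
)

# One 0/1 indicator column per Contact value, one row per input row.
ind <- model.matrix(~ Contact - 1, data = d)
colnames(ind) <- sub("^Contact", "", colnames(ind))  # "ContactABR" -> "ABR"

# Sum the indicators within each ID/Sex/Res group to get counts per ID.
wide <- aggregate(as.data.frame(ind), by = d[c("ID", "Sex", "Res")], FUN = sum)
wide <- wide[order(wide$ID), ]  # aggregate() does not sort by ID first

# Optional: collapse counts to 0/1 if only presence matters.
contact_cols <- setdiff(names(wide), c("ID", "Sex", "Res"))
binary <- wide
binary[contact_cols] <- lapply(wide[contact_cols],
                               function(x) as.integer(x > 0))
```

With the sample data, wide and binary hold the same values because no ID repeats a contact. They would differ as soon as one did.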