R question: How do I stack two or more columns of numbers and keep a factor as well
I have a data.frame like this:
patient analyte1value analyte2value analyte3value
pt1 1 3 5
pt2 2 6 7
pt3 9 10 2
...
I know I can use stack(select=c(columnnames)), but I lose the patient factor.
I want to get out:
pt1 1 analyte1
pt1 3 analyte2
pt1 5 analyte3
pt2 2 analyte1
pt2 6 analyte2
...
(I have a sneaking suspicion that I need plyr or something like that...)
thanks.
One option is one of Hadley's other packages, reshape2:
> require(reshape2)
> dat
patient analyte1 analyte2 analyte3
1 pt1 1 3 5
2 pt2 2 6 7
3 pt3 9 10 2
> melt(dat, id = "patient")
patient variable value
1 pt1 analyte1 1
2 pt2 analyte1 2
3 pt3 analyte1 9
4 pt1 analyte2 3
5 pt2 analyte2 6
6 pt3 analyte2 10
7 pt1 analyte3 5
8 pt2 analyte3 7
9 pt3 analyte3 2
> str(melt(dat, id = "patient"))
'data.frame': 9 obs. of 3 variables:
$ patient : Factor w/ 3 levels "pt1","pt2","pt3": 1 2 3 1 2 3 1 2 3
$ variable: Factor w/ 3 levels "analyte1","analyte2",..: 1 1 1 2 2 2 3 3 3
$ value : int 1 2 9 3 6 10 5 7 2
One can do this in a more long-winded fashion using reshape() from base R:
reshape(dat, direction = "long", sep = "", varying = 2:4,
times = names(dat)[2:4], idvar = "patient",
timevar = "variable", v.names = "value")
with the main difference being that variable isn't a factor with base reshape(). I presume the user-unfriendliness of that was a motivation for writing reshape2...
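For anyone who wants to try the base reshape() route end to end, here is a minimal sketch. It assumes dat looks like the printed output above (I rebuilt it by hand, so the construction is my assumption, not the asker's real data). The conversion of variable to a factor and the re-ordering by patient are extra steps I added to mirror melt() and the layout asked for in the question.

```r
# Example data rebuilt by hand from the printed output (an assumption)
dat <- data.frame(
  patient  = factor(c("pt1", "pt2", "pt3")),
  analyte1 = c(1L, 2L, 9L),
  analyte2 = c(3L, 6L, 10L),
  analyte3 = c(5L, 7L, 2L)
)

# Same base-R call as above: columns 2:4 are stacked into `value`,
# and their names are recorded in `variable`
long <- reshape(dat, direction = "long", varying = 2:4,
                times = names(dat)[2:4], idvar = "patient",
                timevar = "variable", v.names = "value")

# reshape() leaves `variable` as character; make it a factor like melt() does
long$variable <- factor(long$variable, levels = names(dat)[2:4])

# Sort by patient, then analyte, to get the row order requested in the question
long <- long[order(long$patient, long$variable), ]
rownames(long) <- NULL

print(long)
```

If the row order does not matter, the last two statements before print() can be dropped.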
If I understand correctly, you want to reshape
your dataframe to a long format.
reshape(df,varying=list(2:4),times=names(df)[2:4],
idvar="patient",v.names="value",direction="long")
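Since the question started from stack(), it may also be worth sketching how to keep the patient column with that function. This is only an illustration under assumptions: dat is rebuilt by hand from the values in the question, the shortened column names follow the first answer, and the names measure and stacked are mine. stack() concatenates the selected columns one after another, so repeating the patient column once per stacked column lines the rows up again.

```r
# Example data rebuilt by hand from the values in the question (an assumption)
dat <- data.frame(
  patient  = factor(c("pt1", "pt2", "pt3")),
  analyte1 = c(1L, 2L, 9L),
  analyte2 = c(3L, 6L, 10L),
  analyte3 = c(5L, 7L, 2L)
)

# Columns to stack (hypothetical helper name)
measure <- names(dat)[2:4]

# stack() returns `values` plus a factor `ind` holding the source column name
stacked <- stack(dat[measure])

# Columns are stacked whole, one after another, so the patient column
# repeats once per measured column
stacked$patient <- rep(dat$patient, times = length(measure))

# Put patient first and sort to match the layout asked for
stacked <- stacked[order(stacked$patient, stacked$ind),
                   c("patient", "values", "ind")]
rownames(stacked) <- NULL

print(stacked)
```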