On using plyr and ldply

Developer https://www.devze.com 2023-04-13 06:17 Source: Internet
I have a recurring problem - I apologize!

Say I want to have the baseball data (from the plyr package) listed according to 'id' and 'year'. There is a difference between creating the list according to either:

1. mylist1 <- dlply(baseball, .(id, year), identity)

and

2. mylist2 <- dlply(baseball, .(id), dlply, .(year), identity)

in the way the list is organized, but getting the list back into a data frame is working fine with 'mylist1'.

mydf1 <- ldply(mylist1)

but not with 'mylist2'

mydf2 <- ldply(mylist2)

which gives the following error message:

Error in list_to_dataframe(res, attr(.data, "split_label")): Result must be all atomic, or all data frames

I am a newbie to R, and this error message doesn't make much sense to me.

I would like to split my own data frame according to method 2, since I need quite a bit of data manipulation. My question is: how can I merge this list into a data frame? Is there an alternative to do.call(rbind, do.call(rbind,...?

I am grateful for any help!


I agree with @Andrie that this is an odd structure. But I assume that you have a particular reason for doing it this way.

Since it took two passes with dlply to create mylist2, it takes two invocations of ldply to put it back together.

mydf2 <- ldply(mylist2, ldply)

This restores baseball (modulo row ordering):

> class(mydf2)
[1] "data.frame"
> all(dim(mydf2) == dim(baseball))
[1] TRUE
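
To see why a single ldply fails on mylist2, it may help to look at what each list element actually is. The following is a minimal sketch, assuming the plyr package and its bundled baseball data are available; the variable names simply follow the question. The elements of mylist1 are data frames, while the elements of mylist2 are themselves lists of data frames, which is most likely what the "all atomic, or all data frames" message is complaining about.

```r
library(plyr)

# Flat split: one element per (id, year) combination.
mylist1 <- dlply(baseball, .(id, year), identity)

# Nested split: one element per id, each holding a list of per-year pieces.
mylist2 <- dlply(baseball, .(id), dlply, .(year), identity)

# Inspect the first element of each list.
flat_is_df   <- is.data.frame(mylist1[[1]])       # a data frame
nested_is_df <- is.data.frame(mylist2[[1]])       # a list, not a data frame
inner_is_df  <- is.data.frame(mylist2[[1]][[1]])  # data frames sit one level down

# Undo the nesting from the inside out: the inner ldply turns each
# per-id list into a data frame, the outer ldply stacks those.
mydf2 <- ldply(mylist2, ldply)

# Row count should match the original data.
same_rows <- nrow(mydf2) == nrow(baseball)
```

The general idea is one ldply per level of nesting, so a list built with three nested dlply calls would presumably need ldply(x, ldply, ldply) by the same argument passing.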