How to merge on a rowname by some column from another data.frame?

Developer https://www.devze.com 2023-01-22 04:40 Source: web
I wonder if there is a better way to combine two data.frames by treating the row names of one as if they were a column, and then merging on that column against some column of the other data.frame. I know I could do the following:

 df1$rn <- rownames(df1)
 all <- merge(df1, df2, by.x = "rn", by.y = "some_column")

But this produces redundant data (the row names duplicated as a column), which is not needed at all. So what's the smarter way to do it?


In merge, the by, by.x and by.y arguments accept "row.names" or 0 to refer to the row names, so no extra column is needed.

An example using the authors and books data frames from the ?merge help page:

rownames(authors) <- authors$surname
merge(authors, books, by.x = "row.names", by.y = "name")
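For a version that runs on its own, here is a minimal sketch with cut-down stand-ins for the authors and books data frames (the values are illustrative and abbreviated, not the full data from the help page). It also shows that by.x = 0 selects the row names in the same way, and that the key comes back as a column named Row.names.

```r
# Cut-down stand-ins for the ?merge example data (illustrative values only)
authors <- data.frame(
    surname     = c("Tukey", "Venables", "Ripley"),
    nationality = c("US", "Australia", "UK"),
    stringsAsFactors = FALSE
)
books <- data.frame(
    name  = c("Tukey", "Venables", "Ripley", "Ripley", "R Core"),
    title = c("Exploratory Data Analysis", "Modern Applied Statistics ...",
              "Spatial Statistics", "Stochastic Simulation",
              "An Introduction to R"),
    stringsAsFactors = FALSE
)

# Move the key into the row names, so authors needs no key column
rownames(authors) <- authors$surname
authors$surname <- NULL

# Merge the row names of authors against the name column of books
by_name <- merge(authors, books, by.x = "row.names", by.y = "name")

# 0 is an equivalent way to refer to the row names
by_zero <- merge(authors, books, by.x = 0, by.y = "name")

print(by_name)
```

Because merge performs an inner join by default, the "R Core" book (which has no matching author row) does not appear in the result, while "Ripley" appears once per matching book.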


"A smarter way" really depends on your data, which we don't have. But here is one option using match():

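# df1 carries the key in an ordinary column
df1 <- data.frame(
    X1 = 1:10,
    id = letters[1:10]
)

# df2 carries the same key in its row names
df2 <- data.frame(
    X2 = 10:1,
    X3 = letters[11:20]
)
rownames(df2) <- df1$id
# shuffle df2 so the rows no longer line up with df1
df2 <- df2[sample.int(10),]

# look up each df1$id in the row names of df2 and bind the matching rows
cbind(df1,df2[match(df1$id,rownames(df2)),])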

Edit: Vitoshka's answer is the one you're looking for. If I had bothered to look at the help page ?merge, I would have known that as well...

I'll leave my solution here in case somebody needs a speedy alternative to merge:

> system.time(replicate(1000,cbind(df1,df2[match(df1$id,rownames(df2)),])))
   user  system elapsed 
   0.57    0.00    0.57 
> system.time(replicate(1000,merge(df1,df2,by.x="id",by.y="row.names")))
   user  system elapsed 
   2.36    0.02    2.37 