开发者

how to select rows from a dataframe1 in R where dataframe$1column is found somewhere in dataframe2$column

开发者 https://www.devze.com 2023-02-23 00:42 出处:网络
I need to create a new dataframe from rows from dataframe1, such that the value of dataframe1$column is a value found in dataframe2$colum

I need to create a new dataframe from rows from dataframe1, such that the value of dataframe1$column is a value found in dataframe2$colum

the dataframes are:

y <- "name,number,lunch  
joe,2,peaches  
steve,5,hotdog  
jon,7,clamroll  
nick,11,sloppyJoe"


x <- "number,office  
1,1b  
2,1a  
3,2s  
4,4d  
5,f4  
6,f4  
7,h3  
8,g3  
9,j7  
10,d3 
11,jk"  

df1 <- read.csv(textConnection(df1), header=TRUE, sep=",", stringsAsFactors=FALSE)  
df2 <- read.csv(textConnection(df2), header=TRUE, sep=",", stringsAsFactors=FAL开发者_如何学编程SE)

I have tried:

df3 <- df1[which(df1$number == df2$number), ]

to no avail. How do I properly do this in R? I could write a perl script, but I have about 100 of these sets and don't want to create more temp files.


again, the %in% trick :

> df1[df1$number %in% df2$number,]
   number office
2       2     1a
5       5     f4
7       7     h3
11     11     jk

For what it's worth, you can easily just do a merge if you want to combine them. In this case I'd say that's the cleanest solution : gives you every office of the occuring employees, and matches them :

> merge(df1,df2)
  number office  name     lunch
1      2     1a   joe   peaches
2      5     f4 steve    hotdog
3      7     h3   jon  clamroll
4     11     jk  nick sloppyJoe

Check the help files of merge for more options, you can do a whole lot with it.


Joris' answer is spot on. The merge() command can also be useful for this type of stuff. If you are familiar with SQL joins, you can draw parallels between most of the options in merge() and the different join operations.

#Inner join
> merge(df1,df2)
  number office  name      lunch
1      2   1a     joe  peaches  
2      5   f4   steve   hotdog  
3      7   h3     jon clamroll  
4     11     jk  nick  sloppyJoe

#Right join:
> merge(df1,df2, all.x = TRUE)
   number office  name      lunch
1       1   1b    <NA>       <NA>
2       2   1a     joe  peaches  
3       3   2s    <NA>       <NA>
4       4   4d    <NA>       <NA>
5       5   f4   steve   hotdog  
6       6   f4    <NA>       <NA>
7       7   h3     jon clamroll  
8       8   g3    <NA>       <NA>
9       9   j7    <NA>       <NA>
10     10    d3   <NA>       <NA>
11     11     jk  nick  sloppyJoe
0

精彩评论

暂无评论...
验证码 换一张
取 消