
R: Match dataframes and filter these

Developer https://www.devze.com 2023-03-06 21:33 Source: web
I have a data frame like the variable k below. The column all_possible_names contains additional identifiers for each ILMN code. I want to search the column all_possible_names for the identifiers listed in the data frame identifier.

z <- matrix(c(0,0,1,1,0,0,1,1,0,0,0,0,1,0,1,1,0,1,1,1,1,0,0,0,"RND1 | AB849382 | uc001aeu.1","WDR | AB361738 | uc001aif.1", "PLAC8 | AB271612 | uc001amd.1","TYBSA | AB859482","GRA | AB758392 | uc001aph.1","TAF | AB142353"), nrow=6,
dimnames=list(c("ILMN_1651838","ILMN_1652371","ILMN_1652464","ILMN_1652952","ILMN_1653026","ILMN_1653103"),c("A","B","C","D","all_possible_names")))
k<-as.data.frame(z)

search<-c("AB361738","RND1", "LIS")
identifier <- as.data.frame(search)

The result must be like this:

    search    Names
1 AB361738    WDR | AB361738 | uc001aif.1
2     RND1    RND1 | AB849382 | uc001aeu.1
3      LIS    NA

Once this data frame exists, the final output can be created. The column Names must contain only the identifier starting with uc0.

The final result will then be:

    search    Names
1 AB361738    uc001aif.1
2     RND1    uc001aeu.1
3      LIS    NA

Can anyone help me with this?

Many Thanks, Lisanne


Probably not the best way, but a way:

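# For each search term, return the matching entries of all_possible_names
# (fixed=TRUE: literal substring match, value=TRUE: return the strings)
firstStep <- lapply(search, grep, k$all_possible_names, fixed=TRUE, value=TRUE)
# Split each match on " | " and keep only the parts starting with "uc0"
res <- lapply(firstStep, function(subres){
        prts <- unlist(strsplit(subres, " | ", fixed=TRUE))
        prts[which(substr(prts, 1, 3)=="uc0")]
    })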

This returns the result as a list, because a search string may match more than one entry, or none at all (in which case that list element is empty).
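
If you would rather have a data frame like the one in the question than a list, here is a minimal sketch of one way to do it. The helper name find_uc is made up for illustration, and joining several hits with "; " is my own assumption, since the question does not say what should happen when a search string matches more than one entry.

```r
# Data from the question; stringsAsFactors = FALSE keeps the names as character
z <- matrix(c(0,0,1,1,0,0,1,1,0,0,0,0,1,0,1,1,0,1,1,1,1,0,0,0,
              "RND1 | AB849382 | uc001aeu.1", "WDR | AB361738 | uc001aif.1",
              "PLAC8 | AB271612 | uc001amd.1", "TYBSA | AB859482",
              "GRA | AB758392 | uc001aph.1", "TAF | AB142353"),
            nrow = 6,
            dimnames = list(c("ILMN_1651838", "ILMN_1652371", "ILMN_1652464",
                              "ILMN_1652952", "ILMN_1653026", "ILMN_1653103"),
                            c("A", "B", "C", "D", "all_possible_names")))
k <- as.data.frame(z, stringsAsFactors = FALSE)
search <- c("AB361738", "RND1", "LIS")

# Hypothetical helper: uc0 identifier(s) for one search term, NA if none found
find_uc <- function(term, names_col) {
  # Literal substring match against the full name strings
  hits <- grep(term, names_col, fixed = TRUE, value = TRUE)
  if (length(hits) == 0) return(NA_character_)
  # Split on " | " and keep only the parts that start with "uc0"
  parts <- unlist(strsplit(hits, " | ", fixed = TRUE))
  uc <- parts[substr(parts, 1, 3) == "uc0"]
  # Assumption: several hits are joined into one string separated by "; "
  if (length(uc) == 0) NA_character_ else paste(uc, collapse = "; ")
}

# One row per search term, with NA where nothing matched
result <- data.frame(search = search,
                     Names = vapply(search, find_uc, character(1),
                                    names_col = k$all_possible_names,
                                    USE.NAMES = FALSE),
                     stringsAsFactors = FALSE)
print(result)
```

With the example data this should give uc001aif.1 for AB361738, uc001aeu.1 for RND1 and NA for LIS. Keep in mind that fixed=TRUE matches substrings, so a term like RND1 would also hit a name such as RND10; if that matters, split the names first and compare the parts with %in% instead.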
