How do I select a value from a column, based on a selection criterion on another column?

Developer https://www.devze.com 2023-04-06 20:27 Source: web
How can I select a value (name) from one column, based on the frequency (number of repetitions) of a value in another column, in R?

This is my sample data:

 Col1   col2  Col3  Col4 
 Yuva    123  qwe   XXYY
 Arun    234  asd   XYXY
 Yuva    870  ghj   XXYX
 Naan    890  qwe   XYYX
 Shan    231  asd   YXYX
 Yuva    453  qwe   YYXY
 Naan    314  ghj   YXYY
 Yuva    908  bnm   YYYx

Now I would like to know the function or statement that gives me the values from Col1 paired with a value in Col3, based on the number of times that pair has occurred. That is, what is the corresponding value in Col1 when 'qwe' has occurred with it once (or twice, or three times)? The expected (required) answer is:

                  Upon giving   I should get
                 ---------------------------
                    qwe = 1     Naan
                    qwe = 2     Yuva
                    qwe = 3     (not available)
                    asd = 1     Arun
                                Shan
                    ghj = 1     Yuva
                                Naan
                    ghj = 2     (not available)
                    bnm = 1     Yuva

Please help me guys.


The xtabs function returns a contingency table which supports matrix indexing:

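# dat is the data frame holding the sample data from the question
getCombs <- function(nam, cnt) {
    tab <- xtabs(~ Col1 + Col3, data = dat)   # counts of each Col1/Col3 pair
    names(which(tab[, nam] == cnt))           # Col1 values paired with nam exactly cnt times
}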

> getCombs("ghj", 1)
[1] "Naan" "Yuva"
> getCombs("ghj", 3)
character(0)

If you need to have "not available" as the value, then just test the result for length()==0 and return that string if so.
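
A minimal sketch of that length check, assuming the sample data from the question is loaded into a data frame named dat. The wrapper name getCombsOrNA is made up for illustration and is not part of the answer above:

```r
# Sample data from the question, assumed to live in a data frame called dat
dat <- data.frame(
    Col1 = c("Yuva", "Arun", "Yuva", "Naan", "Shan", "Yuva", "Naan", "Yuva"),
    col2 = c(123, 234, 870, 890, 231, 453, 314, 908),
    Col3 = c("qwe", "asd", "ghj", "qwe", "asd", "qwe", "ghj", "bnm"),
    Col4 = c("XXYY", "XYXY", "XXYX", "XYYX", "YXYX", "YYXY", "YXYY", "YYYx"),
    stringsAsFactors = FALSE
)

# Same lookup as in the answer: index the contingency table by Col3 value
getCombs <- function(nam, cnt) {
    tab <- xtabs(~ Col1 + Col3, data = dat)
    names(which(tab[, nam] == cnt))
}

# Hypothetical wrapper: return a fixed string when no Col1 value matches
getCombsOrNA <- function(nam, cnt) {
    res <- getCombs(nam, cnt)
    if (length(res) == 0) "not available" else res
}

getCombsOrNA("qwe", 2)
getCombsOrNA("qwe", 3)
```

With the sample data, the first call should return "Yuva" and the second should fall back to "not available".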


Maybe something like this:

x <- read.table(textConnection("Col1\tcol2\tCol3\tCol4\nYuva\t123\tqwe\tXXYY\nArun\t234\tasd\tXYXY\nYuva\t870\tghj\tXXYX\nNaan\t890\tqwe\tXYYX\nShan\t231\tasd\tYXYX\nYuva\t453\tqwe\tYYXY\nNaan\t314\tghj\tYXYY\nYuva\t908\tbnm\tYYYx\n"), header=TRUE, sep="\t", stringsAsFactors=FALSE)

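# Return the Col1 values that occur exactly `occur` times in rows where Col3 == valCol3
selCol1 <- function(x, valCol3, occur) {
    s <- subset(x, Col3 == valCol3)   # keep only rows with the requested Col3 value
    f <- as.factor(s$Col1)
    tab <- table(f)                   # frequency of each Col1 value in those rows
    idx <- which(tab == occur)
    if (length(idx) == 0)
        return(NA)                    # no Col1 value has that count
    else
        return(levels(f)[idx])
}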

selCol1(x,"qwe",1)
selCol1(x,"qwe",2)
selCol1(x,"qwe",3)
selCol1(x,"ghj",1)
selCol1(x,"ghj",2)
selCol1(x,"bnm",1)
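
If all combinations are needed at once rather than one lookup per call, one possible variation (not taken from either answer) is to turn the pair counts into a long data frame and filter it. This is only a sketch: the names x, counts and lookup are assumptions for illustration, and the data is rebuilt here so the block runs on its own:

```r
# Sample data from the question (only the two columns that matter here)
x <- data.frame(
    Col1 = c("Yuva", "Arun", "Yuva", "Naan", "Shan", "Yuva", "Naan", "Yuva"),
    Col3 = c("qwe", "asd", "ghj", "qwe", "asd", "qwe", "ghj", "bnm"),
    stringsAsFactors = FALSE
)

# Count every Col1/Col3 pair, then flatten the table to one row per pair
counts <- as.data.frame(table(Col1 = x$Col1, Col3 = x$Col3),
                        stringsAsFactors = FALSE)
counts <- counts[counts$Freq > 0, ]   # drop pairs that never occur

# Hypothetical helper: Col1 values seen exactly `occur` times with `valCol3`
lookup <- function(valCol3, occur) {
    counts$Col1[counts$Col3 == valCol3 & counts$Freq == occur]
}

lookup("qwe", 2)
lookup("ghj", 1)
lookup("ghj", 2)
```

On the sample data these calls should give "Yuva", then "Naan" and "Yuva", then an empty character vector for the combination that does not exist.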