Possible Duplicate:
Extracting indices for data frame rows that have MAX value for named field
Hello,
I have a data frame like this:
A1 A3 d
1 a pr 5
2 a be 0
3 a cd 8
4 a dy 0
5 b pr 3
6 b be 4
7 b cd 9
etc...
For each distinct value of A1, I want to keep only the row that has the maximum value of d.
The output should look like this:
A1 A3 d
a cd 8
b cd 9
etc..
The real data frame is bigger; this is just an example.
Can this be done in R without looping or writing a lot of code?
Thanks
The easiest way to do it is to sort by the d column in decreasing order, and then remove duplicates in the A1 column:
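# Sort the rows so that the largest d comes first
df2 <- df[order(df$d, decreasing = TRUE), ]
# Keep only the first row seen for each A1, which is now the one with the largest d
df2[!duplicated(df2$A1), ]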
This assumes that each A1 group has a single unique maximum. If several rows tie for the maximum within a group, only one of them is kept and the others are lost.
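If tied maxima matter, one possible way to keep all of them is to compare each d against its group maximum using base R's `ave()`. This is only a sketch: the data frame below reproduces the rows from the question, plus one extra row (`b`, `zz`, `9`) that is hypothetical and added purely to create a tie.

```r
# Rows from the question, plus one hypothetical tied row for group "b"
df <- data.frame(
  A1 = c("a", "a", "a", "a", "b", "b", "b", "b"),
  A3 = c("pr", "be", "cd", "dy", "pr", "be", "cd", "zz"),
  d  = c(5, 0, 8, 0, 3, 4, 9, 9),
  stringsAsFactors = FALSE
)

# ave() returns, for every row, the maximum d within that row's A1 group
group_max <- ave(df$d, df$A1, FUN = max)

# Keep every row whose d equals its group maximum (ties included)
all_max <- df[df$d == group_max, ]
print(all_max)
```

Unlike the sort-and-deduplicate approach, this returns both `b` rows with d equal to 9.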
Probably something like this, using ddply from the plyr package:
library(plyr)
ddply(dfr, "A1", function(curdfr) curdfr[which.max(curdfr$d), ])
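The same split-apply-combine idea can probably be written without plyr, using only base R. The sketch below assumes `dfr` holds the rows shown in the question; `best` is just an illustrative name.

```r
# Rows from the question
dfr <- data.frame(
  A1 = c("a", "a", "a", "a", "b", "b", "b"),
  A3 = c("pr", "be", "cd", "dy", "pr", "be", "cd"),
  d  = c(5, 0, 8, 0, 3, 4, 9),
  stringsAsFactors = FALSE
)

# Split by A1, take the row with the largest d in each piece, then stack the pieces
pieces <- split(dfr, dfr$A1)
best <- do.call(rbind, lapply(pieces, function(g) g[which.max(g$d), ]))
print(best)
```

As with `which.max()` in the ddply version, only the first maximum in each group is returned when there are ties.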
DATA
mydf <- read.table(textConnection("
Lp A1 A3 d
1 a pr 5
2 a be 0
3 a cd 8
4 a dy 0
5 b pr 3
6 b be 4
7 b cd 9"), header = TRUE, row.names = "Lp")
CODE
library(data.table)
mydf <- data.table(mydf)
# Within each A1 group, keep the row of .SD that has the largest d
mydf[, .SD[which.max(d)], by = A1]