R dataframe filtering

Developer https://www.devze.com 2023-02-23 12:05 Source: Internet
I have a dataframe df as follows:

   A  B C
  NA  1 2
   2 NA 3
   4  5 6
   7  8 9

What I want to do is remove all the rows that contain an NA.

If I use

 apply(df,1,function(row) all(!is.na(row)))

I get a logical vector with one entry per row: TRUE if the row does not contain an NA and FALSE if it does. But how do I get the row names, so that I can create something like

df2<-df[-c(list of rows that contains NA),] 

which will give me all the new dataframe with NA in rows.

Thanks in advance.


Assuming you have a dataframe that looks like this:

   A  B C
1 NA  1 2
2  2 NA 3
3  4  5 6
4  7  8 9

Then try:

> df1[apply(df1, 1, function(x) !any(is.na(x))), ]
  A B C
3 4 5 6
4 7 8 9

It doesn't use rownames but rather a logical vector. I guess Joshua and I read your question differently, but we used the same method.
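
If the row names or positions themselves are wanted, as the question asks, they can be recovered from the same logical vector. The following is a minimal sketch, assuming the example data from the question; the names `has_na`, `bad_idx`, `bad_names`, `df2` and `df3` are made up for illustration.

```r
# Rebuild the example data (the name df1 follows the answer above).
df1 <- data.frame(A = c(NA, 2, 4, 7),
                  B = c(1, NA, 5, 8),
                  C = c(2, 3, 6, 9))

# TRUE for every row that contains at least one NA.
has_na <- apply(df1, 1, function(row) any(is.na(row)))

# Positions and row names of the rows that contain an NA.
bad_idx   <- which(has_na)
bad_names <- rownames(df1)[has_na]

# Logical indexing works even when no row contains an NA.
df2 <- df1[!has_na, ]

# Negative indexing gives the same result here, but df1[-integer(0), ]
# returns zero rows, so guard against an empty bad_idx.
df3 <- if (length(bad_idx) > 0) df1[-bad_idx, ] else df1
```

The guard on the last line is the main reason to prefer the logical vector over `df[-c(...), ]`: an empty index vector would otherwise drop every row.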

Joshua's suggestion is more compact:

> na.omit(df1)
  A B C
3 4 5 6
4 7 8 9

And it reminds me that I should have used:

> df1[complete.cases(df1), ]
  A B C
3 4 5 6
4 7 8 9
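
As a rough comparison of the two shortcuts, the sketch below (again assuming the question's example data; the variable names are illustrative) shows that both keep the same rows, that `na.omit` additionally records which rows it dropped, and that `complete.cases` can be limited to a subset of columns.

```r
df1 <- data.frame(A = c(NA, 2, 4, 7),
                  B = c(1, NA, 5, 8),
                  C = c(2, 3, 6, 9))

by_omit  <- na.omit(df1)
by_cases <- df1[complete.cases(df1), ]

# Both approaches keep the same rows.
same_rows <- identical(rownames(by_omit), rownames(by_cases))

# na.omit stores the positions of the dropped rows in an attribute.
dropped <- as.integer(attr(by_omit, "na.action"))

# Only require columns A and C to be complete; NAs in B are tolerated.
complete_ac <- df1[complete.cases(df1[, c("A", "C")]), ]
```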


You can use the logical vector from your apply call to index your data.frame.

> Data[!apply(Data,1,function(row) all(!is.na(row))),]
   A  B C
1 NA  1 2
2  2 NA 3
> # or like this:
> Data[apply(Data,1,function(row) any(is.na(row))),]
   A  B C
1 NA  1 2
2  2 NA 3
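
The two calls above are equivalent because "not all values are non-NA" is the same condition as "at least one value is NA". A small sketch, assuming the question's data under the name `Data` (the other names are illustrative), checks this and uses one flag vector to split the data both ways:

```r
Data <- data.frame(A = c(NA, 2, 4, 7),
                   B = c(1, NA, 5, 8),
                   C = c(2, 3, 6, 9))

# Two ways of flagging rows that contain an NA.
flag_a <- !apply(Data, 1, function(row) all(!is.na(row)))
flag_b <- apply(Data, 1, function(row) any(is.na(row)))

# The flags agree for every row.
flags_agree <- all(flag_a == flag_b)

# One flag vector yields both halves of the data.
rows_with_na    <- Data[flag_b, ]
rows_without_na <- Data[!flag_b, ]
```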


is.na on a data.frame returns a matrix, which is a better candidate for apply:

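df <- read.table(textConnection(" A  B  C
NA 1  2
2  NA 3
4   5 6
7   8 9
"), header = TRUE)

## a logical matrix, TRUE where a value is NA
is.na(df)

## logical for selecting rows that are all NA
## (use any() instead of all() for rows with at least one NA)
apply(is.na(df), 1, function(x) all(x))

## one-liner: drop rows that are all NA
df[!apply(is.na(df), 1, function(x) all(x)), ]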
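
To make the matrix idea concrete, here is a hedged sketch that works on the `is.na` matrix directly and contrasts "any NA in the row" with "all NA in the row". It assumes the question's data; `na_mat`, `any_na`, `all_na` and `any_na_sums` are illustrative names, and `rowSums` is swapped in as a vectorised alternative to `apply`.

```r
df <- read.table(text = "A B C
NA 1 2
2 NA 3
4 5 6
7 8 9", header = TRUE)

# Logical matrix with the same shape as df.
na_mat <- is.na(df)

# Rows with at least one NA versus rows made up entirely of NA.
any_na <- apply(na_mat, 1, any)
all_na <- apply(na_mat, 1, all)

# Counting the NAs per row gives the same flags without apply.
any_na_sums <- rowSums(na_mat) > 0

# Keep only the rows without any NA.
clean <- df[!any_na, ]
```

With this data no row is entirely NA, so filtering on `all_na` keeps all four rows, whereas filtering on `any_na` keeps only rows 3 and 4.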