Removing rows in dataset goes wrong

开发者 https://www.devze.com 2023-03-08 17:08 Source: the web
I have the following dataset:

text <- 1:13
numbers <- rep(1, 13)
test <- data.frame(
    text = text,
    is.numeric.feature = numbers)

   text is.numeric.feature
1     1                  1
2     2                  1
...
13    13                 1

Now I want to remove all rows where the numeric feature == 0 (there are none here, but in other datasets there are). When I use the following command, the result is completely empty. What did I do wrong?

test[-c(which(test$is.numeric.feature==0)),]


The reason is that which(test$is.numeric.feature==0) returns integer(0) when there are no zeros. Negating it gives integer(0) again, so the index selects no rows:

> test[-integer(0),]
[1] text               is.numeric.feature
<0 rows> (or 0-length row.names)
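
The behaviour can be reproduced step by step. This is only a minimal sketch that rebuilds the data frame from the question; the names idx and result are illustrative and not part of the original post.

```r
# Rebuild the data frame from the question
test <- data.frame(text = 1:13, is.numeric.feature = rep(1, 13))

# There are no zeros, so which() returns an empty integer vector
idx <- which(test$is.numeric.feature == 0)
length(idx)             # 0

# Negating an empty vector leaves it empty
identical(-idx, idx)    # TRUE

# Indexing rows with an empty vector selects nothing
result <- test[-idx, ]
nrow(result)            # 0
```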

To avoid this, it is better to work with logical vectors:

test[test$is.numeric.feature!=0,]

On a side note, the c() in your one-liner is redundant: which() returns a vector anyway. And please, never give your data frames or vectors a name that is also the name of a function (data and text, for example, are both functions in R). You will run into trouble at some point.


Here's another way of doing this, by negating the comparison:

test[!(test$is.numeric.feature == 0), ]


It goes wrong because the which() call returns integer(0), an empty integer vector. Indexing with -integer(0) is not interpreted as "don't omit anything": negating an empty vector gives integer(0) again, which means "index nothing". It should work as expected if there is at least one zero in your data.
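
If you want to keep which() for some reason, one option is to guard against the empty case. The sketch below is only an illustration: drop_rows_which is a hypothetical helper, and with_zero is made-up data with zeros placed in rows 2 and 5.

```r
# Data from the question (no zeros) and a made-up variant with two zeros
test <- data.frame(text = 1:13, is.numeric.feature = rep(1, 13))
with_zero <- test
with_zero$is.numeric.feature[c(2, 5)] <- 0

# Hypothetical helper: apply negative indexing only when idx is non-empty
drop_rows_which <- function(df, idx) {
  if (length(idx) > 0) df[-idx, , drop = FALSE] else df
}

# With at least one zero, plain negative indexing already works
plain <- with_zero[-which(with_zero$is.numeric.feature == 0), ]
nrow(plain)      # 11

# The guarded version handles both cases
guarded_zero <- drop_rows_which(with_zero, which(with_zero$is.numeric.feature == 0))
guarded_none <- drop_rows_which(test, which(test$is.numeric.feature == 0))
nrow(guarded_zero)   # 11
nrow(guarded_none)   # 13
```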

But you don't need which() anyway; a logical vector is fine. Both of these work:

test[test$is.numeric.feature!=0,]

subset(test,is.numeric.feature!=0)
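
As a quick check, here is a small sketch that runs both forms on the data frame from the question and on a made-up variant (with_zero, an assumption for illustration) that does contain zeros:

```r
# Data from the question (no zeros) and a made-up variant with two zeros
test <- data.frame(text = 1:13, is.numeric.feature = rep(1, 13))
with_zero <- test
with_zero$is.numeric.feature[c(2, 5)] <- 0

# Logical indexing keeps every row when nothing matches the condition
by_index_none  <- test[test$is.numeric.feature != 0, ]
by_subset_none <- subset(test, is.numeric.feature != 0)
nrow(by_index_none)    # 13
nrow(by_subset_none)   # 13

# ...and drops only the matching rows when zeros are present
by_index_zero  <- with_zero[with_zero$is.numeric.feature != 0, ]
by_subset_zero <- subset(with_zero, is.numeric.feature != 0)
nrow(by_index_zero)    # 11
nrow(by_subset_zero)   # 11
```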