Levels not present when handling 1,0,NA

Developer https://www.devze.com 2023-01-04 00:21 Source: web
Here I am with another newbie question.

I am importing a CSV file that looks like this:

"username","interest","has_card"
"test01","not_interesting",1
"test02","maybe_interesting",0
"test03","not_interesting",0
"test04","maybe",1

mydata <- read.table(file("test.csv", encoding = "UTF-8"), header=TRUE, sep=",")

Then (this may sound like a silly newbie question) why can I get the levels for string-based columns, like this:

> levels(mydata$interest)
[1] "maybe"             "maybe_interesting" "not_interesting"

But not for binary (integer) columns?

> levels(mydata$has_card)
NULL

I am drawing a barplot from a frequency table, and I need to rename the labels 0 and 1 to something like "No" and "Yes" in the plot legend. But I can't do:

levels(mydata$has_card)[1] <- "Yes"
levels(mydata$has_card)[0] <- "No"

as I would with the "maybe", "maybe_interesting" and "not_interesting" levels.


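In R versions before 4.0.0, the default behavior of read.table is to convert character variables (those not converted to logical, numeric or complex) to factors; see as.is and stringsAsFactors in the help page. From R 4.0.0 on, stringsAsFactors defaults to FALSE, so character columns stay character unless you ask for factors. Either way, an integer column such as has_card stays integer and has no levels until you convert it with factor: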

R> class(mydata$has_card)
[1] "integer"
R> class(mydata$interest)
[1] "factor"
R> factor(mydata$has_card, labels=c("No", "Yes"))
[1] Yes No  No  Yes
Levels: No Yes
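
To make the relabelling stick, the converted factor has to be assigned back to the data frame. The following is a minimal sketch under a few assumptions: the sample rows from the question are rebuilt inline through read.table's text argument (instead of reading test.csv), and the 0/1 mapping is spelled out with levels so that it does not depend on sort order. The name counts is only illustrative.

```r
# Rebuild the question's sample data inline so the sketch runs on its own.
csv_text <- '"username","interest","has_card"
"test01","not_interesting",1
"test02","maybe_interesting",0
"test03","not_interesting",0
"test04","maybe",1'
mydata <- read.table(text = csv_text, header = TRUE, sep = ",")

# Map 0 -> "No" and 1 -> "Yes" explicitly and store the result in the data frame.
mydata$has_card <- factor(mydata$has_card, levels = c(0, 1), labels = c("No", "Yes"))

# The column now has levels that can be inspected or renamed.
levels(mydata$has_card)

# Frequency table of has_card (rows) by interest (columns).
counts <- table(mydata$has_card, mydata$interest)

# With legend.text = TRUE, barplot uses the row names ("No", "Yes") as legend labels.
barplot(counts, legend.text = TRUE)
```

Because the legend is taken from the row names of the table, relabelling the factor before calling table is enough; no separate legend argument should be needed.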


Numeric fields are not automatically converted to factors. You need to convert them explicitly using factor.
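
Since the title mentions NA alongside 1 and 0, here is a small sketch of how the explicit conversion behaves when values are missing. The vector below is hypothetical (the question's sample data contains no NA), and the "Unknown" label is an arbitrary choice for illustration.

```r
# Hypothetical column containing 1, 0 and a missing value.
has_card <- c(1, 0, NA, 1)

# NA is not a level by default, so it stays NA after the conversion.
card <- factor(has_card, levels = c(0, 1), labels = c("No", "Yes"))

# useNA = "ifany" adds a separate count for the missing values.
tab <- table(card, useNA = "ifany")
print(tab)

# To give missing values their own label, first turn NA into a level with addNA,
# then rename that level.
card_na <- addNA(card)
levels(card_na)[is.na(levels(card_na))] <- "Unknown"
print(levels(card_na))
```

Whether missing values should get their own bar or be left out of the plot depends on the data; both variants are shown only so the difference is visible.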
