I'd like to sort a categorical variable my own way. I have grouped my dataset into categories like "1-5", "6-10", "11-20", ..., ">251" and so forth. If I plot the variables or display them in a table, the order of the legend entries or of the labels is "messed up".
This is not surprising, since R does not know that these unordered categories are in fact ordered. Is there a way to attach a manually defined sequence to them?
Thanks in advance for any suggestions!
Categorical variables are stored as (or converted to be) factors when you plot them. The order they appear in the plot depends upon the levels of the factor.
You likely want to use cut to create your groups, e.g.
dfr <- data.frame(x = runif(100, 1, 256))
dfr$groups <- cut(dfr$x, seq(1, 256, 5))
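If the groups should carry labels like the ones in the question rather than the default interval names, cut also accepts explicit breaks and labels. A minimal sketch, assuming made-up break points, label text and data range (none of these come from the original post):

```r
set.seed(1)  # arbitrary seed so the sketch is reproducible
dfr <- data.frame(x = runif(100, 1, 300))  # hypothetical data range

# Hypothetical break points; each interval is (lower, upper]
breaks <- c(0, 5, 10, 20, 50, 100, 250, Inf)
# One label per interval, written in the order they should appear
labels <- c("1-5", "6-10", "11-20", "21-50", "51-100", "101-250", ">250")

dfr$groups <- cut(dfr$x, breaks = breaks, labels = labels)

levels(dfr$groups)  # levels follow the order of `labels`
table(dfr$groups)   # counts are tabulated in that same order
```

Because the levels are created in the order of the labels vector, tables and plot legends built from dfr$groups should follow that sequence without further sorting.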
This problem is also very similar to another recent SO question.
When I want to specify a different order for a factor manually (tedious, but sometimes necessary) here is what I do:
> ## a factor
> x <- factor(letters[1:3])
> ## write out levels with dput
> dput(levels(x))
c("a", "b", "c")
> ## copy, paste, modify and use factor again. e.g.
> x <- factor(x, levels=c("b", "a", "c"))
> x
[1] a b c
Levels: b a c
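The same approach should carry over to range labels like the ones in the question. A small sketch, assuming the categories are already stored as character strings (the vector grp and its values are invented for illustration):

```r
# Hypothetical category labels stored as plain character strings
grp <- c("11-20", "1-5", ">251", "6-10", "1-5", "11-20")

# Default: levels are sorted as strings, which is generally not numeric order
levels(factor(grp))

# Explicit levels give the intended sequence
f <- factor(grp, levels = c("1-5", "6-10", "11-20", ">251"))
levels(f)
table(f)  # counts appear in the order of the levels
```

Any value of grp that is missing from the levels vector becomes NA, so it is worth checking the result with table(f, useNA = "ifany") after re-leveling.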
I like using split for that sort of thing.
vect = runif(10)
vect.categories = c(rep(LETTERS[1], 5), rep(LETTERS[2], 3), rep(LETTERS[5], 2))
category.list = split(vect, vect.categories)
....
May not be related, but thought I'd offer the suggestion.
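To tie this back to the ordering question: when the grouping variable passed to split is a factor, the resulting list follows the factor's levels. A rough sketch, assuming an arbitrary custom order of E, A, B (chosen only for illustration):

```r
vect <- runif(10)

# Grouping variable as a factor with a hypothetical, manually chosen level order
vect.categories <- factor(
  c(rep("A", 5), rep("B", 3), rep("E", 2)),
  levels = c("E", "A", "B")
)

category.list <- split(vect, vect.categories)
names(category.list)    # list elements follow the level order: E, A, B
lengths(category.list)  # group sizes in that same order
```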