How can I count the number of times a value occurs in a column of a dataframe?

Developer (https://www.devze.com), 2023-01-05 23:53. Source: web
Is there a simple way of identifying the number of times a value is in a vector or column of dataframe? I essentially want the numerical values of a histogram but I do not know how to access it.

# sample vector
a <- c(1,2,1,1,1,3,1,2,3,3)

#hist
hist(a)

Thank you.

UPDATE:

On Dirk's suggestion I am using hist. Is there a better way than specifying the breaks as 1.9, 2.9, etc. when I know that all my values are integers?

 hist(a, breaks=c(1,1.9,2.9,3.9,4.9,5.9,6.9,7.9,8.9,9.9), plot=FALSE)$counts


Use the table function.
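A minimal sketch of what that looks like on the sample vector from the question. The variable name counts is just for illustration:

```r
a <- c(1, 2, 1, 1, 1, 3, 1, 2, 3, 3)

# One count per distinct value, named by the value
counts <- table(a)
print(counts)

# Look up the count for a single value; the names are character strings
counts[["1"]]

# The same information as a data frame with columns a and Freq
counts_df <- as.data.frame(counts)
print(counts_df)
```

Note that table names its entries with character strings, so you index with "1" rather than 1.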


Try this:

R> a <- c(1,2,1,1,1,3,1,2,3,3)
R> b <- hist(a, plot=FALSE)
R> str(b)
List of 7
 $ breaks     : num [1:5] 1 1.5 2 2.5 3
 $ counts     : int [1:4] 5 2 0 3
 $ intensities: num [1:4] 1 0.4 0 0.6
 $ density    : num [1:4] 1 0.4 0 0.6
 $ mids       : num [1:4] 1.25 1.75 2.25 2.75
 $ xname      : chr "a"
 $ equidist   : logi TRUE
 - attr(*, "class")= chr "histogram"
R> 

R is object-oriented and most methods give you meaningful results back. Use them.


If you want to use hist, you don't need to write the breaks out by hand as you did; generate them with the seq function:

# breaks at 0.9, 1.9, ..., 9.9
br <- seq(0.9, 9.9, 1)
num <- hist(a, br, plot=FALSE)$counts
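Since the question says all values are integers, one possible variation is to derive the breaks from the data and centre each bin on an integer. This is only a sketch under that assumption; the names br_int and num_int are made up for the example:

```r
a <- c(1, 2, 1, 1, 1, 3, 1, 2, 3, 3)

# Breaks halfway between integers: 0.5, 1.5, 2.5, 3.5
br_int <- seq(min(a) - 0.5, max(a) + 0.5, by = 1)

# One bin per integer from min(a) to max(a)
num_int <- hist(a, breaks = br_int, plot = FALSE)$counts
print(num_int)
```

With half-integer breaks no value ever sits on a bin edge, so you don't have to think about which side of the interval is closed.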

If you're looking for the count of one specific value, you can also use which.

For instance:

num <- length(which(a == 1))
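A common alternative, shown here as a sketch rather than a recommendation, is to sum the logical vector directly, since TRUE counts as 1. The names n_which, n_sum and b are only for illustration:

```r
a <- c(1, 2, 1, 1, 1, 3, 1, 2, 3, 3)

n_which <- length(which(a == 1))  # the approach shown above
n_sum   <- sum(a == 1)            # TRUE is counted as 1, FALSE as 0

# With missing values the two differ: which() drops NA on its own,
# while sum() needs na.rm = TRUE to avoid returning NA
b <- c(a, NA)
n_which_na <- length(which(b == 1))
n_sum_na   <- sum(b == 1, na.rm = TRUE)
```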


In addition to the performance difference between hist and table in the case of many unique values that Dirk and mbq already pointed out, I would also like to mention another difference in functionality.

hist(...)$counts will also give you zero counts for bins that contain no cases. This is valuable when you need to know exactly how many bins (bars on a barplot, for example) will end up in a subsequent plot.

table on the other hand will only give you counts for existing values.
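If you want zero counts from table as well, one way is to convert to a factor with the levels declared up front. The sketch below assumes, purely for illustration, that the possible values are 1 to 5:

```r
a <- c(1, 2, 1, 1, 1, 3, 1, 2, 3, 3)

# Plain table(): only the values that actually occur (1, 2 and 3)
counts_seen <- table(a)

# Declaring the levels keeps the empty categories, with a count of 0
counts_all <- table(factor(a, levels = 1:5))
print(counts_all)

# tabulate() gives the same counts for positive integers as a plain vector
counts_vec <- tabulate(a, nbins = 5)
print(counts_vec)
```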

You might also want to check the right argument of hist, which controls whether the bins are right-closed (the default) or left-closed. This matters whenever your values fall exactly on a break.
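To see the effect, here is a small sketch that deliberately puts the breaks on the integers (the breaks 0:4 are chosen just for this example), so every value lands on a bin edge:

```r
a <- c(1, 2, 1, 1, 1, 3, 1, 2, 3, 3)
br <- 0:4

# right = TRUE (the default): bins are [0,1], (1,2], (2,3], (3,4]
counts_right <- hist(a, breaks = br, plot = FALSE)$counts
print(counts_right)

# right = FALSE: bins are [0,1), [1,2), [2,3), [3,4]
counts_left <- hist(a, breaks = br, right = FALSE, plot = FALSE)$counts
print(counts_left)
```

The same data gives counts shifted by one bin depending on which side is closed, which is easy to miss when the breaks coincide with the data values.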
