I have the following data:
Person,Messages
Dave,8
James,6
Dave,6
Dave,8
Dave,8
John,5
John,5
John,20
Dave,0
....
I want to create a heatmap that shows the density of each message count for all players. I want to limit the x-axis to message values 0-14 (in other words, I care that John has 20, and it should affect the overall density, but I don't need to see 20 listed on the x-axis, because it doesn't happen that often). Player names are on the y-axis. How do I do this? Please let me know if this does not make sense.
If I'm understanding you correctly, you may not have to transform your data to a matrix at all, if you're willing to use geom_tile from ggplot2 (ddply comes from the plyr package):
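library(plyr)     # ddply
library(ggplot2)  # ggplot, geom_tile
#Read the sample data
dat <- read.table(textConnection("Person,Messages
Dave,8
James,6
Dave,6
Dave,8
Dave,8
John,5
John,5
John,20
Dave,0"),sep = ",",header = TRUE)
#Count how often each Person/Messages combination occurs
dat <- ddply(dat,.(Person,Messages),summarise,val = length(Person))
#One tile per combination, filled by its count
ggplot(dat,aes(x = Messages, y = Person, fill = val)) +
  geom_tile()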
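The question also asks to show only 0-14 on the x-axis while still letting John's 20 count toward the colour scale. A minimal sketch of one way to do that, assuming zooming is acceptable: coord_cartesian() changes the visible range without dropping data, whereas setting limits on the x scale would remove the out-of-range rows. The counts here are computed with base aggregate() rather than ddply purely to keep the sketch self-contained; the names counts and p are illustrative.

```r
library(ggplot2)

# Raw data from the question
dat <- data.frame(
  Person   = c("Dave", "James", "Dave", "Dave", "Dave",
               "John", "John", "John", "Dave"),
  Messages = c(8, 6, 6, 8, 8, 5, 5, 20, 0)
)

# Count each Person/Messages combination (base R stand-in for the ddply step)
counts <- aggregate(val ~ Person + Messages,
                    data = cbind(dat, val = 1), FUN = sum)

# Zoom the x-axis to 0-14; the half-unit padding keeps the tiles at 0 and 14
# fully visible. The row for John/20 stays in the data, so it still
# contributes to the fill scale even though it is off-screen.
p <- ggplot(counts, aes(x = Messages, y = Person, fill = val)) +
  geom_tile() +
  coord_cartesian(xlim = c(-0.5, 14.5))

print(p)
```

With the sample data every visible tile has a count of at most 3, so the off-screen value matters mainly when the hidden counts are larger than the visible ones.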
Or here's a somewhat laborious route to a full matrix that you could use as input to image, assuming that we're starting with the original data in dat (cast comes from the reshape package):
library(plyr)     # ddply
library(reshape)  # cast
#Some data to pad with the missing combinations
pad <- expand.grid(unique(dat$Person),
                   min(dat$Messages):max(dat$Messages))
colnames(pad) <- c('Person','Messages')
#Aggregate the data and merge with pad data
dat <- ddply(dat,.(Person,Messages),summarise,val = length(Person))
tmp <- merge(dat,pad,all.y = TRUE)
#Convert from long to wide
rs <- cast(tmp,Person~Messages,value = 'val')
#Clean up the result: names become row names, missing counts become 0
rownames(rs) <- rs$Person
rs <- rs[,-1]
rs[is.na(rs)] <- 0
> rs
      0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
Dave  1 0 0 0 0 0 1 0 3 0  0  0  0  0  0  0  0  0  0  0  0
James 0 0 0 0 0 0 1 0 0 0  0  0  0  0  0  0  0  0  0  0  0
John  0 0 0 0 0 2 0 0 0 0  0  0  0  0  0  0  0  0  0  0  1
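For completeness, here is a rough sketch of passing such a matrix to image while showing only the 0-14 columns. It rebuilds the count matrix with base table() instead of reusing rs, so that it runs on its own; the names m and keep are illustrative. Passing zlim = range(m) is an assumption about what "affect the overall density" means: it makes the colour scale span the full matrix, including the hidden column for 20.

```r
# Raw data from the question
dat <- data.frame(
  Person   = c("Dave", "James", "Dave", "Dave", "Dave",
               "John", "John", "John", "Dave"),
  Messages = c(8, 6, 6, 8, 8, 5, 5, 20, 0)
)

# Full Person x Messages count matrix, with zeros for unseen combinations
tab <- table(dat$Person, factor(dat$Messages, levels = 0:max(dat$Messages)))
m <- matrix(as.vector(tab), nrow = nrow(tab),
            dimnames = list(rownames(tab), colnames(tab)))

# Keep only the columns for 0-14 messages for display
keep <- m[, as.character(0:14), drop = FALSE]

# image() expects z with one row per x value, hence the transpose
image(x = 0:14, y = seq_len(nrow(keep)), z = t(keep),
      zlim = range(m), axes = FALSE, xlab = "Messages", ylab = "")
axis(1, at = 0:14)
axis(2, at = seq_len(nrow(keep)), labels = rownames(keep), las = 1)
```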