I searched a lot of questions about heatmaps throughout the site and packages, but I still have a problem.
I have clustered data (kmeans/EM/DBSCAN, ...), and I want to create a heatmap in which rows from the same cluster are grouped together. Similar color patterns should then sit next to each other, so the heatmap generally looks block-diagonal. I tried ordering the data by cluster number and displaying it:
k = kmeans(data, 3)
d = data.frame(data)
d = data.frame(d, k$cluster)
d = d[order(d$k.cluster),]
heatmap(as.matrix(d))
But it is still not sorted and looks like this link: But I want it to be sorted by cluster number and to look like this:
Can I do this in R?
I searched lots of packages and tried many ways, but I still have a problem.
Thanks a lot.
You can do this using reshape2 and ggplot2 as follows:
library(reshape2)
library(ggplot2)
# Create dummy data
set.seed(123)
df <- data.frame(
a = sample(1:5, 1000, replace=TRUE),
b = sample(1:5, 1000, replace=TRUE),
c = sample(1:5, 1000, replace=TRUE)
)
# Perform clustering
k <- kmeans(df, 3)
# Append id and cluster
dfc <- cbind(df, id=seq(nrow(df)), cluster=k$cluster)
# Add idsort, the id number ordered by cluster
dfc$idsort <- dfc$id[order(dfc$cluster)]
dfc$idsort <- order(dfc$idsort)
# use reshape2::melt to create data.frame in long format
dfm <- melt(dfc, id.vars=c("id", "idsort"))
ggplot(dfm, aes(x=variable, y=idsort)) + geom_tile(aes(fill=value))
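The two idsort lines above boil down to inverting the permutation returned by order(). Here is a minimal sketch on a toy vector; the name cl and its values are made up for illustration and are not part of the answer's data:

```r
# Toy cluster labels (hypothetical values, for illustration only)
cl <- c(2, 1, 3, 1, 2)

# order(cl) lists the row indices in cluster order ...
ord <- order(cl)

# ... and order(ord) inverts that permutation: element i is the
# position that row i takes once the rows are sorted by cluster
idsort <- order(ord)

# Reading the labels back in idsort order groups equal clusters together
cl[order(idsort)]
# 1 1 2 2 3
```

As far as I can tell, the same positions can be obtained in one step with rank(cl, ties.method = "first").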
You should set Rowv and Colv to NA if you don't want the dendrograms and the subsequent ordering. BTW, you should also turn off the scaling. Using the df of Andrie:
heatmap(as.matrix(df)[order(k$cluster),],Rowv=NA,Colv=NA,scale="none",labRow=NA)
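If you also want to see where one cluster ends and the next begins, heatmap() accepts a RowSideColors argument that draws a colour strip next to the rows. A rough sketch, assuming the same dummy data as above; the names ord and cluster_cols and the three colours are arbitrary choices of mine:

```r
# Dummy data and clustering, mirroring the example above
set.seed(123)
df <- data.frame(
  a = sample(1:5, 1000, replace = TRUE),
  b = sample(1:5, 1000, replace = TRUE),
  c = sample(1:5, 1000, replace = TRUE)
)
k <- kmeans(df, 3)

# Row order that groups the clusters together
ord <- order(k$cluster)

# One colour per row, picked by cluster and given in the reordered row order
cluster_cols <- c("red", "green", "blue")[k$cluster[ord]]

# Same call as above, plus a side bar marking cluster membership
heatmap(as.matrix(df)[ord, ], Rowv = NA, Colv = NA, scale = "none",
        labRow = NA, RowSideColors = cluster_cols)
```

RowSideColors must be a character vector with one entry per row of the matrix being plotted, which is why it is indexed with ord as well.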
In fact, heatmap() itself is built on image(). You can hack away using image() to construct a plot exactly like you want. heatmap() uses layout() internally, so it will be difficult to set the margins. With image() you could do, e.g.:
myHeatmap <- function(x, ord, xlab="", ylab="", main="My Heatmap",
                      col=heat.colors(5), ...){
  op <- par(mar=c(3,0,2,0)+0.1)
  on.exit(par(op))
  nc <- NCOL(x)
  nr <- NROW(x)
  labCol <- names(x)
  x <- t(x[ord,])
  image(1L:nc, 1L:nr, x, xlim = 0.5 + c(0, nc), ylim = 0.5 + c(0, nr),
        axes = FALSE, xlab=xlab, ylab=ylab, main=main, col=col, ...)
  axis(1, 1L:nc, labels = labCol, las = 2, line = -0.5, tick = 0)
  axis(2, 1L:nr, labels = NA, las = 2, line = -0.5, tick = 0)
}
library(RColorBrewer)
myHeatmap(df,order(k$cluster),col=brewer.pal(5,"BuGn"))
This produces a plot with smaller margins on the sides. You can also manipulate axes, colors, ... You should definitely take a look at the RColorBrewer package.
(By the way, this custom function is based on the internal plotting code used by heatmap(), simplified for illustration and stripped of all the dendrogram stuff.)