library(amap)
set.seed(5)
Kmeans(mydata, 5, iter.max=500, nstart=1, method="euclidean")
I ran the code above, which uses Kmeans from the 'amap' package, several times. Even though the parameters and the seed value are always the same, the clustering results are different every time I run Kmeans or other clustering methods.
I also tried kmeans functions from other packages, with the same outcome.
In fact, I want to use Weka and R together, so I also tried SimpleKMeans in the RWeka package, and it always gives the same result. However, I do not know how to store the clustered data along with the cluster number from SimpleKMeans in RWeka, so I'm stuck.
So, how can I keep the clustering result the same on every run? Or, how can I store the clustering result from SimpleKMeans in R?
You must be doing something wrong. I get reproducible results each time I run the following code, as long as I set the seed before each call to Kmeans():
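library(amap)
# Run the same clustering 10 times, resetting the seed before every call
out <- vector(mode = "list", length = 10)
for (i in seq_along(out)) {
  set.seed(1)
  out[[i]] <- Kmeans(iris[, -5], 3, iter.max = 500, nstart = 1, method = "euclidean")
}
# Compare each result with the next one
for (i in seq_along(out[-1])) {
  print(all.equal(out[[i]], out[[i + 1]]))
}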
The last for loop prints:
[1] TRUE
[1] TRUE
[1] TRUE
[1] TRUE
[1] TRUE
[1] TRUE
[1] TRUE
[1] TRUE
[1] TRUE
This indicates that the results are the same on every run (all.equal() compares values up to a small numerical tolerance).
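One way to make sure the seed is always set right before clustering is to wrap both steps in a small function. The sketch below is only an illustration: reproducible_kmeans is a hypothetical helper name, and it uses base R's kmeans() instead of amap's Kmeans() so that it runs without extra packages. The last lines also show one way to keep the cluster number next to the original data, which is what the question asks for on the R side.

```r
# Hypothetical helper (not part of any package): reseed immediately before
# clustering so every call starts from the same random initial centers.
# Note that set.seed() changes the global RNG state as a side effect.
reproducible_kmeans <- function(x, k, seed = 1, ...) {
  set.seed(seed)
  stats::kmeans(x, centers = k, ...)
}

fit1 <- reproducible_kmeans(iris[, -5], 3, iter.max = 500)
fit2 <- reproducible_kmeans(iris[, -5], 3, iter.max = 500)

# Both fits start from the same RNG state, so the assignments match
print(identical(fit1$cluster, fit2$cluster))

# Store the cluster number alongside the original rows
clustered <- cbind(iris, cluster = fit1$cluster)
head(clustered)
```

The same wrapper idea should carry over to amap's Kmeans() by swapping the call inside the function, since the loop above shows it responds to set.seed() in the same way.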
Just a reminder that k-means results are sensitive to the order of the data points in the data set. If you rerun the same code, with the same seed, on the same data with the rows shuffled, you can get a different result.
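A quick way to check the order sensitivity on your own data is to cluster the original and a shuffled copy with the same seed and compare the partitions. This is a minimal sketch using base R's kmeans() on iris as a stand-in; the variable names are made up for the example, and whether the two partitions actually differ depends on the data and the seed.

```r
x <- as.matrix(iris[, -5])

set.seed(42)
perm <- sample(nrow(x))        # a random reordering of the rows
x_shuffled <- x[perm, ]

# Same seed for both fits: the same row indices are drawn as initial
# centers, but in the shuffled copy they refer to different observations.
set.seed(1)
fit_orig <- kmeans(x, centers = 3, iter.max = 500)
set.seed(1)
fit_shuf <- kmeans(x_shuffled, centers = 3, iter.max = 500)

# Map the shuffled labels back to the original row order before comparing
labels_shuf <- integer(nrow(x))
labels_shuf[perm] <- fit_shuf$cluster

# Cross-tabulate the two partitions. If each row and column has a single
# non-zero cell, the partitions agree up to relabelling; otherwise the
# row order changed the result.
tab <- table(original = fit_orig$cluster, shuffled = labels_shuf)
print(tab)
```

Cluster numbers themselves are arbitrary, so compare the partitions with a table like this rather than testing the label vectors for equality.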
Have you set the seed?
set.seed(1)
Every time k-means initializes the centroids, they are chosen randomly, so the random number generator needs to be seeded for the result to be reproducible.
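Setting the seed once at the top of a script is not enough if the clustering function is then called several times, because each random draw advances the generator's state. The sketch below uses sample.int() as a stand-in for the random choice of initial centers (base R's kmeans() picks its starting rows this way; that other implementations do the same is an assumption here).

```r
set.seed(1)
first  <- sample.int(150, 3)   # stand-in for picking 3 initial centers
second <- sample.int(150, 3)   # RNG state has advanced, so this draw
                               # generally differs from the first

set.seed(1)
again <- sample.int(150, 3)    # reseeding restores the first draw

print(first)
print(second)
print(identical(first, again)) # TRUE
```

So to get the same clustering on every run, call set.seed() with the same value immediately before each clustering call, not just once per session.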