How do I calculate reconstruction error, and where can I find information about it? (I want to calculate the reconstruction error of my data after running the K-means algorithm.)
I need to calculate the distance from every point to the center of its cluster.
One way to calculate the reconstruction error of a given vector is to compute the Euclidean distance between it and its representation. In K-means, each vector is represented by its nearest center.
So after running K-means: for each vector, calculate its error as the Euclidean distance between that vector and its centroid. Sum the errors over all vectors, and you have the error on your training set. Lower errors tend to indicate better clusterings overall.
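The steps above can be sketched in a few lines of NumPy. This is only an illustration, not a definitive implementation: the function name `reconstruction_error`, the toy data `X`, and the `centers` array are all made up for the example, and it assumes you already have the final centers from whatever K-means implementation you ran.

```python
import numpy as np

def reconstruction_error(X, centers):
    """Sum of Euclidean distances from each row of X to its nearest center.

    Hypothetical helper for illustration; X has shape (n_points, n_features)
    and centers has shape (n_clusters, n_features).
    """
    X = np.asarray(X, dtype=float)
    centers = np.asarray(centers, dtype=float)
    # Pairwise Euclidean distances, shape (n_points, n_clusters).
    dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    # Each vector is represented by its nearest center.
    labels = dists.argmin(axis=1)
    # Error of each vector: distance to its own (nearest) center.
    per_point = dists[np.arange(len(X)), labels]
    return per_point.sum(), per_point, labels

# Toy data and centers (assumed values, not from a real K-means run).
X = np.array([[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 4.0]])
centers = np.array([[0.0, 1.0], [10.0, 2.0]])

total, per_point, labels = reconstruction_error(X, centers)
print(total)       # total reconstruction error on this toy set
print(per_point)   # error of each individual vector
print(labels)      # index of the nearest center for each vector
```

If you want the squared-distance version of the error instead, `(per_point ** 2).sum()` gives it from the same per-point distances.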
Indeed, the K-means algorithm itself tries to optimize a closely related metric: strictly speaking, it minimizes the sum of squared Euclidean distances between each vector and its centroid. If you let it run to convergence, it will find a local minimum of that squared reconstruction error.
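For reference, the objective that K-means is usually stated to minimize can be written as below. The symbols are assumptions chosen for this sketch: x_i are the n data vectors, mu_k are the K cluster centers, and J is the total squared reconstruction error.

```latex
J = \sum_{i=1}^{n} \min_{k \in \{1,\dots,K\}} \lVert x_i - \mu_k \rVert_2^2
```

Dropping the square on the norm gives the plain sum of Euclidean distances, which is a reasonable error to report but is not exactly the quantity the algorithm minimizes.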