Say I have a set of 10,000 images that I'd like to classify based on similarity. A number of people have recommended comparing histograms as a cheap way to measure similarity. This thread, for example, recommends using 6 histograms for each comparison.
If I compare each image's histogram with all other images in the set, that's O(n^2): 10,000*9,999/2, or about 50 million, image pairs, each needing 6 histogram comparisons, which is very slow. How can I speed this up?
Hash each histogram in some way, make a sorted list of the hashes, find adjacent values that are similar (within some limit), then compare only those histograms.
However, making the histograms is likely to be the slow step.
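A rough sketch of the hash-and-sort idea, in case it helps. Everything here is an assumption on my part: the "hash" is just the mean bin index of a normalized histogram (`scalar_key`), the full comparison is histogram intersection, and `key_limit` and `min_score` are made-up thresholds you would have to tune for your images.

```python
from typing import List, Sequence, Tuple


def scalar_key(hist: Sequence[float]) -> float:
    """Hypothetical 'hash': the mean bin index of the histogram."""
    total = sum(hist)
    if total == 0:
        return 0.0
    return sum(i * v for i, v in enumerate(hist)) / total


def intersection(a: Sequence[float], b: Sequence[float]) -> float:
    """Histogram intersection; 1.0 means identical for normalized histograms."""
    return sum(min(x, y) for x, y in zip(a, b))


def similar_pairs(
    hists: Sequence[Sequence[float]],
    key_limit: float = 0.5,   # assumed: how far apart two keys may be
    min_score: float = 0.8,   # assumed: minimum intersection to count as similar
) -> List[Tuple[int, int]]:
    # Sort image indices by their scalar key so likely matches end up adjacent.
    keyed = sorted((scalar_key(h), idx) for idx, h in enumerate(hists))
    pairs = []
    for pos in range(len(keyed)):
        key_i, i = keyed[pos]
        # Only walk forward while the keys stay within the limit.
        for nxt in range(pos + 1, len(keyed)):
            key_j, j = keyed[nxt]
            if key_j - key_i > key_limit:
                break
            # Full (more expensive) comparison only for these candidates.
            if intersection(hists[i], hists[j]) >= min_score:
                pairs.append((min(i, j), max(i, j)))
    return pairs


if __name__ == "__main__":
    demo = [
        [0.5, 0.5, 0.0, 0.0],
        [0.45, 0.55, 0.0, 0.0],
        [0.0, 0.0, 0.5, 0.5],
        [0.0, 0.0, 0.4, 0.6],
    ]
    print(similar_pairs(demo))
```

Two different histograms can land on the same key, which is why the full comparison still runs on each candidate pair. If most keys fall within `key_limit` of each other, this degrades back toward comparing everything with everything, so a single scalar key may be too coarse for real data.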