This is a tough one. There is probably a name for this and I don't know it, so I'll describe the problem exactly.
I have a dataset including a number of user-submitted values. I need to be able to determine, based on some sort of average or, better, a "closeness of data", which value is the correct one. For example, if I received the three submissions 4, 10, 3 from three users, I would know that 3 or 4 would be the "correct" value in this case. If I were to average them, I'd get about 5.67, which is not the intended result.
I'm attempting to do this using MySQL and PHP.
tl;dr Need to find a value from a dataset based on "closeness" of relative values (using MySQL/PHP)
Thanks!
Clustering in a database isn't going to be a single-query procedure. It takes several iterations to generate the clusters effectively.
You first need to decide how many clusters you want. If you want only one cluster, then obviously everything goes into it. If you want two, then you can write your program to separate the values into two groups using some sort of distance (similarity) metric.
In other words, I don't think this is a MySQL question so much as a clustering question.
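To make the two-cluster idea concrete, here is a minimal sketch, written in Python only for brevity (the same logic ports directly to PHP). It assumes one-dimensional numeric submissions and uses the simplest possible rule: sort the values, split them at the widest gap, and treat the larger group as the consensus. The function names `split_at_largest_gap` and `consensus_value` are hypothetical, and this is just one possible clustering rule, not the only one.

```python
def split_at_largest_gap(values):
    """Split values into two groups at the widest gap between sorted neighbours."""
    ordered = sorted(values)
    if len(ordered) < 2:
        return ordered, []
    # Index i identifies the gap between ordered[i] and ordered[i + 1].
    widest = max(range(len(ordered) - 1),
                 key=lambda i: ordered[i + 1] - ordered[i])
    return ordered[:widest + 1], ordered[widest + 1:]


def consensus_value(values):
    """Mean of the larger group; assumes at least one value.

    Tie-break (an arbitrary choice for this sketch): prefer the lower group.
    """
    low, high = split_at_largest_gap(values)
    biggest = low if len(low) >= len(high) else high
    return sum(biggest) / len(biggest)


low, high = split_at_largest_gap([4, 10, 3])
print(low, high)                    # [3, 4] [10]
print(consensus_value([4, 10, 3]))  # 3.5
```

With the submissions from the question, 10 ends up alone in its own group and the consensus comes from 3 and 4. For more than two clusters, or for noisier data, an iterative method such as k-means would be the usual next step.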
I think this is the kind of thing you're looking for:
-- `table` is a reserved word in MySQL, so it must be quoted with backticks
SELECT id, ABS(id - (SELECT AVG(id) FROM `table`)) AS distance
FROM `table`
ORDER BY distance
LIMIT 1;
For example, if your data set contains the IDs 3, 4, 10, the average is 5.6667 and the closest value to it is 4. If your data set is 3, 6, 10, 14, the average is 8.25 and the closest value is 10.
This is what this query returns. Hope it helps.
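The same "closest to the average" rule can be checked outside the database. This is a small sketch in Python (easy to port to PHP), assuming the values have already been fetched into a plain list; `closest_to_mean` is a hypothetical helper name.

```python
def closest_to_mean(values):
    """Return the submitted value with the smallest distance to the mean."""
    mean = sum(values) / len(values)
    return min(values, key=lambda v: abs(v - mean))


print(closest_to_mean([3, 4, 10]))      # 4
print(closest_to_mean([3, 6, 10, 14]))  # 10
```

Note that the mean is still pulled toward outliers, so with a very large stray value the closest submission may no longer be one of the "agreeing" values.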
I have the impression you are looking for the median.
E.g. in the list 1 2 3 4 100, the median (central value) is 3.
You may want to search for [https://stackoverflow.com/search?q=sql+median finding the median in SQL].
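As a quick illustration of why the median fits here, this sketch uses Python's standard `statistics` module (PHP has no built-in median, so there you would sort the array and take the middle element). It assumes the values are already loaded into a list.

```python
import statistics

# The outlier 100 does not move the median.
print(statistics.median([1, 2, 3, 4, 100]))  # 3

# The submissions from the question: the median is one of the "close" values.
print(statistics.median([4, 10, 3]))         # 4

# With an even count, median_low returns an actually submitted value
# instead of averaging the two middle ones.
print(statistics.median_low([3, 6, 10, 14])) # 6
```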