How does SELECT DISTINCT work in MySQL?_问答_开发者

开发者 https://www.devze.com 2023-04-05 23:00 出处：网络

I have a table开发者_JAVA技巧 with multiple rows which have a same data.I used SELECT DISTINCT to get a unique row and it works fine.But when i use ORDER BY with SELECT DISTINCT it gives me unsorted d

相关专题：sql

I have a table开发者_JAVA技巧 with multiple rows which have a same data. I used SELECT DISTINCT to get a unique row and it works fine. But when i use ORDER BY with SELECT DISTINCT it gives me unsorted data.

Can anyone tell me how distinct works?

Based on what criteria it selects the row?

From your comment earlier, the query you are trying to run is

Select distinct id from table where id2 =12312 order by time desc.

As I expected, here is your problem. Your select column and order by column are different. Your output rows are ordered by time, but that order doesn't necessarily need to preserved in the id column. Here is an example.

id  | id2    | time
-------------------
1   | 12312  | 34
2   | 12312  | 12
3   | 12312  | 48

If you run

SELECT * FROM table WHERE id2=12312 ORDER BY time DESC

you will get the following result

id  | id2    | time
-------------------
2   | 12312  | 12
1   | 12312  | 34
3   | 12312  | 48

Now if you select only the id column from this, you will get

id
--
2
1
3

This is why your results are not sorted.

When you specify SELECT DISTINCT it will give you all the rows, eliminating duplicates from the result set. By "duplicates" I mean rows where all fields have the same values. For example, say you have a table that looks like:

 id  |  num
 --------------
 1   |  1 
 2   |  3
 3   |  3

SELECT DISTINCT * would return all rows above, whereas SELECT DISTINCT num would return two rows:

num
-----
1
3

Note that which row actual row (eg: whether it's row 2 or row 3) it selects is irrelevant, as the result would be indistinguishable.

Finally, DISTINCT should not affect how ORDER BY works.

Reference: MySQL SELECT statement

The behaviour you describe happens when you ORDER BY an expression that is not present in the SELECT clause. The SQL standard does not allow such a query but MySQL is less strict and allows it.

Let's try an example:

SELECT DISTINCT colum1, column2
FROM table1
WHERE ...
ORDER BY column3

Let's say the content of the table table1 is:

 id | column1 | column2 | column3
----+---------+---------+---------
  1 | A       | B       | 1
  2 | A       | B       | 5
  3 | X       | Y       | 3

Without the ORDER BY clause, the above query returns following two records (without ORDER BY the order is not guaranteed):

 column1 | column2
---------+---------
 A       | B
 X       | Y

But with ORDER BY column3 the order is also not guaranteed.

The DISTINCT clause operates on the values of the expressions present in the SELECT clause. If row #1 is processed first then (A, B) is placed in the result set and it is associated with row #1. Then, when row #2 is processed, the values of the SELECT expressions produce the record (A, B) that is already in the result set. Because of DISTINCT it is dropped. Row #3 produces (X, Y) that is also put in the result set. Then, the ORDER BY column3 clause makes the records be sorted in the result set as (A, B), (X, Y).

But if row #2 is processed before row #1 then, following the same logic exposed in the previous paragraph, the records in the result set are sorted as (X, Y), (A, B).

There is no rule imposed on the database engine about the order it processes the rows when it runs a query. The database is free to process the rows in any order it consider it's better for performance.

Your query is invalid SQL and the fact that it can return different results using the same input data proves it.