I need to randomly select, in an efficient way, 10 rows from my table.
I found out that the following works nicely (after the query, I just select 10 random e开发者_高级运维lements in PHP from the 10 to 30 I get from the query):
SELECT * FROM product WHERE RAND() <= (SELECT 20 / COUNT(*) FROM product)
However, the subquery, though relatively cheap, is computed for every row in the table. How can I prevent that? With a variable? A join?
Thanks!
A variable would do it. Something like this:
SELECT @myvar := (SELECT 20 / COUNT(*) FROM product);
SELECT * FROM product WHERE RAND() <= @myvar;
Or, from the MySql math functions doc:
You cannot use a column with RAND() values in an ORDER BY clause, because ORDER BY would evaluate the column multiple times. However, you can retrieve rows in random order like this:
mysql> SELECT * FROM tbl_name ORDER BY
> RAND();
ORDER BY RAND() combined with LIMIT is useful for selecting a random sample from a set of rows:
mysql> SELECT * FROM table1, table2
> WHERE a=b AND c<d -> ORDER BY RAND()
> LIMIT 1000;
RAND() is not meant to be a perfect random generator. It is a fast way to generate random numbers on demand that is portable between platforms for the same MySQL version.
Its a highly mysql specific trick but by wrapping it in another subquery MySQL will make it a constant table and compute it only once.
SELECT * FROM product WHERE RAND() <= ( select * from ( SELECT 20 / COUNT(*) FROM product ) as const_table )
SELECT * FROM product ORDER BY RAND() LIMIT 10
Don't use order by rand(). This will result in a table scan. If you have much data at all in your table this will not be efficient at all. First determine how many rows are in the table:
select count(*) from table
might work for you, though you should probably cache this value for some time since it can be slow for large datasets.
explain select * from table
will give you the db statistics for the table (how many rows the statistics thinks are in the table) This is much faster, however it is less accurate and less accurate still for InnoDB.
once you have the number of rows, you should write some code like:
pseudo code:
String SQL = "SELECT * FROM product WHERE id IN (";
for (int i=0;i<numResults;i++) {
SQL += (int)(Math.rand() * tableRows) + ", ";
}
// trim off last ","
SQL.trim(",");
SQL += ")";
this will give you fast lookup on PK and avoid the table scan.
精彩评论