Edit: actually aside from select distinct(which I haven't verified yet), the main performance bottle neck might be the network speed, when server and client are both on localhost, select all 2 milion records took 36 seconds, however, on a (supposedly high speed) network with client sitting on another box, the query is not yet done after 10 minutes.
This is supposedly 100mbps network but when I checked the client(java jdbc), it's receiving data at a rate of 3kb/second. The mysql server, however, is sending at a speed o开发者_JS百科f 100kb/sec(including other client connections though). Why is the java jdbc client receiving data at such a low rate?select distinct(indexed_column) from mytable
is very slow on mytable with only 1 million rows, the indexed_column is a non-unique index. is there a way to optimize it? an explain gives this:
id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | 1 | SIMPLE | mytable | range | NULL | my_index | 50 | NULL | 1759002 | Using index for group-by |
does type=range means it's not using index? is this why it's slow?
I would build a unique index on the table, on the column you want "DISTINCT" of...
Hence your looking for DISTINCT on a given column. If you build a UNIQUE INDEX on a column(or columns) you are looking for distinct combinations, the indexed pages will only hold a pointer to the first record that qualifies for such combination.
Ex: if you have
Category Count
1 587
2 321
3 172
4 229
5 837
Your UNIQUE INDEX on category will only have 5 records... in this case, even though over 2,000 entries across 5 categories, the DISTINCT CATEGORY count is 5, the index has 5, you're done. Apply this concept to your table of 1 MILLION + records.
精彩评论