MySQL multicolumn index

开发者 https://www.devze.com 2023-03-07 15:14 Source: web
Should I include col3 & col4 in my index on MyTable if this is the only query I intend to run on my database?

Select MyTable.col3, MyTable.col4
From MyTable 
Inner Join MyOtherTable
On MyTable.col1 = MyOtherTable.col1
And MyTable.col2 = MyOtherTable.col2;

The tables I'm using have about half a million rows in them. For the purposes of my question, col1 & col2 are a unique set found in both tables.

Here's the example table definition if you really need to know:

CREATE TABLE MyTable 
(col1 varchar(10), col2 varchar(10), col3 varchar(10), col4 varchar(10));

CREATE TABLE MyOtherTable 
(col1 varchar(10), col2 varchar(10));

So, should it be this?

   CREATE INDEX MyIdx ON MyTable (col1,col2);

Or this?

   CREATE INDEX MyIdx ON MyTable (col1,col2,col3,col4);


Adding columns col3 and col4 will not help, because you're just pulling those values after finding the rows using columns col1 and col2. The speed would normally come from making sure columns col1 and col2 are indexed.

You should actually split those indexes since you're not using them together:

CREATE INDEX MyIdx1 ON MyTable (col1); CREATE INDEX MyIdx2 ON MyTable (col2);

I don't think a combined index will help you in this case.

CORRECTION: I think I've misspoken, since you intend to use only that query on the two tables and never have the individual columns joined in isolation. In your case it appears you could get some speed up by putting them together. It would be interesting to benchmark this to see just how much of a speedup you'd see on 1/2 million rows using a combined index versus individual ones. (You should still not use columns col3 and col4 in the index, since you're not joining anything by them.)
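
One rough way to run that comparison, as a sketch rather than a proper benchmark: look at the plan with EXPLAIN under each indexing scheme, then time the query itself. The index names below (MyIdx, MyIdx1, MyIdx2) are just placeholders, and the statements assume the tables from the question already exist without any of these indexes.

```sql
-- Scheme A: one combined index on the join columns.
CREATE INDEX MyIdx ON MyTable (col1, col2);

EXPLAIN
SELECT MyTable.col3, MyTable.col4
FROM MyTable
INNER JOIN MyOtherTable
   ON MyTable.col1 = MyOtherTable.col1
  AND MyTable.col2 = MyOtherTable.col2;

-- Scheme B: drop the combined index and try two single-column indexes.
DROP INDEX MyIdx ON MyTable;
CREATE INDEX MyIdx1 ON MyTable (col1);
CREATE INDEX MyIdx2 ON MyTable (col2);

-- Same query again, so the two plans can be compared side by side.
EXPLAIN
SELECT MyTable.col3, MyTable.col4
FROM MyTable
INNER JOIN MyOtherTable
   ON MyTable.col1 = MyOtherTable.col1
  AND MyTable.col2 = MyOtherTable.col2;
```

In the EXPLAIN output, the key column shows which index MySQL picked for each table and key_len shows how many bytes of that index are used for the lookup. With the combined index you would expect key_len to span both col1 and col2; with separate indexes only one of them can normally drive the join lookup.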


A query returning half a million rows joined from two tables is never going to be very fast - because it's returning half a million rows.

An index on col1,col2 seems sufficient (as a secondary index), but depending on what other columns you have, adding (col3,col4) might make it a covering index.
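
To check whether the wider index really acts as a covering index for this query, EXPLAIN is the usual tool. A minimal sketch, assuming the tables from the question and a placeholder index name (MyCoveringIdx):

```sql
-- Four-column index: col1, col2 serve the join, col3, col4 serve the SELECT list.
CREATE INDEX MyCoveringIdx ON MyTable (col1, col2, col3, col4);

EXPLAIN
SELECT MyTable.col3, MyTable.col4
FROM MyTable
INNER JOIN MyOtherTable
   ON MyTable.col1 = MyOtherTable.col1
  AND MyTable.col2 = MyOtherTable.col2;
```

If the Extra column for the MyTable row says "Using index", MySQL is answering from the index alone and never touching the table rows. The price is a bigger index that has to be maintained on every write.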

In InnoDB it might be better to make (col1,col2) the primary key; InnoDB will then cluster the table on it, which is something of a win.
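
As a sketch of what that could look like on the existing table, assuming it uses InnoDB, that (col1,col2) really is unique, and that neither column contains NULLs (primary key columns must be NOT NULL):

```sql
-- Make the join columns the clustered primary key of MyTable.
-- This rebuilds the table, so expect it to take a while on 500,000 rows.
ALTER TABLE MyTable
  MODIFY col1 varchar(10) NOT NULL,
  MODIFY col2 varchar(10) NOT NULL,
  ADD PRIMARY KEY (col1, col2);
```

With the rows stored in primary key order, a lookup on (col1,col2) lands directly on the full row, so col3 and col4 come along without a second lookup.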

But once again, if your query joins 500,000 rows with no other WHERE clause, and returns 500,000 rows, it's not going to be fast, because it needs to fetch all of the rows to return them.


I don't think anyone else mentioned it, so I'm adding that you should have a compound (col1,col2) index on both tables:

CREATE INDEX MyIdx ON MyTable (col1,col2);

CREATE INDEX MyOtherIdx ON MyOtherTable (col1,col2);

And another point. An index on (col1,col2,col3,col4) will be helpful if you ever need to use a DISTINCT variation of your query:

Select DISTINCT
    MyTable.col3, MyTable.col4
From MyTable 
Inner Join MyOtherTable
On MyTable.col1 = MyOtherTable.col1
And MyTable.col2 = MyOtherTable.col2;