Improve speed of a JOIN in MySQL

Developer (https://www.devze.com), 2023-01-03 10:27. Source: Internet
I know there are similar threads around, but this is the first time I have realized that query speed might affect me, so it is not easy for me to transfer the solutions to other people's problems to my own case.

That being said, I have been using the following query successfully with smaller data sets, but when I run it on what are mildly large tables (about 120,000 records), I end up waiting for hours.

  INSERT INTO anothertable
  (id,someint1,someint1,somevarchar1,somevarchar1)
  SELECT DISTINCT md.id,md.someint1,md.someint1,md.somevarchar1,pd.somevarchar1
  FROM table1 AS md
  JOIN table2 AS pd
  ON (md.id = pd.id);

Tables 1 and 2 contain about 120,000 records. The query has been running for almost 2 hours now. Is this normal? Do I just have to wait? I really have no idea, but I am pretty sure it could be done better, since it's my very first try.

I have read about indexing, but I don't yet know what to index in my case.

Thanks for any suggestions. Feel free to point me to the very beginner's guides!


Assuming id is an auto-increment primary key, the DISTINCT is useless, since each row would already be unique. In that case, removing it should also improve performance, as SELECT DISTINCT can be quite slow.
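As a rough sketch of that suggestion, here is the statement from the question with only the DISTINCT removed. The column names are placeholders: the question's anonymized list repeats names, so `someint2` and `somevarchar2` are hypothetical stand-ins and should be replaced with the real columns.

```sql
-- Same INSERT ... SELECT as in the question, minus DISTINCT.
-- someint2 / somevarchar2 are hypothetical names, not from the original post.
INSERT INTO anothertable
  (id, someint1, someint2, somevarchar1, somevarchar2)
SELECT md.id, md.someint1, md.someint2, md.somevarchar1, pd.somevarchar1
FROM table1 AS md
JOIN table2 AS pd
  ON md.id = pd.id;
```

This is only safe if id really is unique in both tables; otherwise the result may contain rows that DISTINCT would have collapsed.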

And as previously mentioned, make sure the id field has an index on both tables (which it does if it's the PK).
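One way to check this, sketched with the table names from the question, is to list the existing indexes and then ask MySQL how it plans to run the join part of the statement:

```sql
-- List the indexes that already exist on each table.
SHOW INDEX FROM table1;
SHOW INDEX FROM table2;

-- Show the execution plan for the join used in the question.
EXPLAIN
SELECT md.id, pd.somevarchar1
FROM table1 AS md
JOIN table2 AS pd
  ON md.id = pd.id;
```

In the EXPLAIN output, a `key` value of NULL together with `type` = ALL for both tables would suggest that neither side of the join is using an index on id.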


Index the things you are joining on. In this case, create indexes on table1.id and table2.id. You should probably also have a foreign key from one table to the other, though without meaningful names, it is difficult to advise on the direction.
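A minimal sketch of what that could look like follows. The index and constraint names are made up for illustration, the choice between a primary key and a plain index depends on whether id is unique in each table, and the foreign key direction (table2 referencing table1) is purely an assumption.

```sql
-- If id is unique in table1 and no primary key exists yet:
ALTER TABLE table1 ADD PRIMARY KEY (id);

-- If id may repeat in table2, a plain (non-unique) index is enough for the join.
-- idx_table2_id is a hypothetical name.
CREATE INDEX idx_table2_id ON table2 (id);

-- Optional foreign key; requires a storage engine that supports it (e.g. InnoDB).
-- fk_table2_table1 and the direction are assumptions.
ALTER TABLE table2
  ADD CONSTRAINT fk_table2_table1
  FOREIGN KEY (id) REFERENCES table1 (id);
```

Only the statements that match the actual schema should be run; adding a primary key will fail if the column contains duplicates or NULLs.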


The only things you could index that might gain you some speed are the join keys (md.id and pd.id). As they are most likely primary keys, they should be indexed already. Maybe a clustered index would help.

Is the DISTINCT really necessary? It only removes duplicates, which can only occur if there are duplicate entries in your source tables. I think DISTINCT is the biggest problem here.
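To find out whether duplicates are actually possible, one option is to look for repeated id values in each source table. This is a sketch using the table names from the question:

```sql
-- Any row returned here is an id that occurs more than once in table2.
SELECT id, COUNT(*) AS occurrences
FROM table2
GROUP BY id
HAVING COUNT(*) > 1
LIMIT 10;

-- Same check for table1.
SELECT id, COUNT(*) AS occurrences
FROM table1
GROUP BY id
HAVING COUNT(*) > 1
LIMIT 10;
```

If both queries return no rows, each id matches at most one row on either side, so the join cannot produce duplicate rows and the DISTINCT does no useful work.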
