开发者

How can I speed up a mysql query? 2 joins, 3 text conditions, ~3,000k records

开发者 https://www.devze.com 2023-03-18 08:18 出处:网络
This is my Query: SELECT distinct decks.deckid, tmnt.tournamentid, tmnt.tournamentn开发者_开发百科ame,

This is my Query:

SELECT distinct decks.deckid, tmnt.tournamentid, tmnt.tournamentn开发者_开发百科ame,
        cast(concat(Monthname(tournamentdate),' ',day(tournamentdate),', ',
        year(tournamentdate)) as char) as ConfiguredTournamentDate,
        tournamentdate, decks.pilot, decks.deckname, decks.record
FROM tournaments tmnt
LEFT JOIN decks on decks.tournamentid = tmnt.tournamentid
LEFT JOIN cardsindeck cid on cid.deckid = decks.deckid
WHERE upper(decks.deckname) like '%JULIAN23%'
OR upper(decks.pilot) like '%JULIAN23%'
OR upper(cid.cardname) like '%JULIAN23%'
ORDER BY tournamentdate desc;

This query returns 12 results, of a possible 44K.

  1. Tournaments - 2,775 records
  2. Deckname - 44,400 records
  3. Cardsindeck - 3,330,000 records

I have FULLTEXT indexes built (I'll include the explain below). This query takes 3-6 seconds, which isn't bad given the records I guess, but I want to see what I can do to make it faster. The data is only updated once a day, could I get somewhere building a temp table at that time and querying that?

I don't have a better way to ask than that, basically I wanted to see if there were any other methods I could employ here.

EXPLAIN:

1       SIMPLE  tmnt    ALL                                     2772    Using temporary; Using filesort
1       SIMPLE  decks   ref     TourneyID_idx   TourneyID_idx   5       magic_decks_july.tmnt.TournamentID      12
1       SIMPLE  cid     ref     DeckID_idx      DeckID_idx      5       magic_decks_july.decks.DeckID   24      Using where; Distinct

UPDATE

This reduced the query to .5 seconds. Muuuuuch more acceptable. Now to dig into the other 3 problem areas... :)

SELECT distinct decks.deckid, tmnt.tournamentid, tmnt.tournamentname, 
    cast(concat(Monthname(tournamentdate),' ',day(tournamentdate),', 
    ',year(tournamentdate)) as char) as ConfiguredTournamentDate, tournamentdate, 
    decks.pilot, decks.deckname, decks.record 
from tournaments tmnt 
inner join decks on decks.tournamentid = tmnt.tournamentid 
where decks.deckname like '%CONDESCEND%' 
OR decks.pilot like '%CONDESCEND%' 
UNION ALL
SELECT distinct decks.deckid, tmnt.tournamentid, tmnt.tournamentname,
    cast(concat(Monthname(tournamentdate),' ',day(tournamentdate),', 
    ',year(tournamentdate)) as char) as ConfiguredTournamentDate, tournamentdate, 
    decks.pilot, decks.deckname, decks.record 
from cardsindeck cid
left join decks on cid.deckid = decks.deckid 
left join tournaments tmnt on decks.tournamentid = tmnt.tournamentid 
where cid.cardname like '%CONDESCEND%'
order by tournamentdate


It looks like you would have better results if you rewrote your query to filter right within the join statements.

Example,

SELECT *
FROM (select now() query_time) qt
INNER JOIN table_with_3mil_rows t1 ON value rlike "test"
INNER JOIN smaller_table        t2 ON value rlike "test"

The reasoning behind the subselect for the current query time is two fold. Prevent caching of the query, and allows for you to do the most amount of filtering right at the start of the result set building, where it can do the most good.

Mysql builds querys table by table, so when it opens the first table for read, it's pulling in all the rows EXCEPT for what the ON clause filters out. Then it reads in the next table, filters out stuff that doesn't join right, and adds those to the table, and on until the joins are complete.

THEN it does the filtering called for by the where clause.

If you reorder it so it filters on your deck name right up front, you reduce the size of the table in memory and therefore speed the query up.

I would also suggest using partioning on your cards table. At 3 million records, that's huge. Partitioning off of deck ID would help


add

...
 LIMIT 12

... also I think that creating a dedicated table with pre-calculated rsults is an excellent and simplest solution if the data are updated only once per day. Just recalc your table every time the data got updated and the execute plain SELECT * FROM CACHED_DATA;


Your problem is the use of UPPER - it is invoked in the where clause for every row, which is very expensive

In mysql, like is case insensitive, so using UPPER has no effect in your query.

Try this:

SELECT distinct decks.deckid, tmnt.tournamentid, tmnt.tournamentname,
    cast(concat(Monthname(tournamentdate),' ',day(tournamentdate),', ',
    year(tournamentdate)) as char) as ConfiguredTournamentDate,
    tournamentdate, decks.pilot, decks.deckname, decks.record
FROM tournaments tmnt
LEFT JOIN decks on decks.tournamentid = tmnt.tournamentid
LEFT JOIN cardsindeck cid on cid.deckid = decks.deckid
WHERE decks.deckname like '%julian23%' -- Removed upper() usage
OR decks.pilot like '%julian23%' -- Removed upper() usage
OR cid.cardname like '%julian23%' -- Removed upper() usage
ORDER BY tournamentdate desc;
0

精彩评论

暂无评论...
验证码 换一张
取 消