This query is very slow, taking about 1 second per record. Sadly, for (and because of) the size of the database, this is untenable as it will take days to complete.
Can you suggest a way to speed it up substantially? (I only need to run it once, but in a <1hr window ideally)
update participants set start_time = (select min(time_stamp)
from tasks where participant_id = participants.participant_id)
I don't think full table descriptions are needed to suggest a more sensible query structure, but I can post them if required. The database is MySQL.
Many thanks.
You would need to make sure there is an index on tasks.participant_id. Depending on the number of tasks per participant (if there are really many) you could also add an index on time_stamp, although I don't know if MySQL would make use of it.
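As a rough sketch of the indexing suggestion: a single composite index on both columns is one way to cover the lookup and the min() together. The index name below is made up, and the id in the explain query is only a placeholder; the table and column names come from the question.

```sql
-- Hypothetical index name; columns are the ones used in the question.
-- participant_id comes first so the index serves the per-participant lookup,
-- time_stamp second so min(time_stamp) can be read from the index itself.
create index idx_tasks_participant_time
    on tasks (participant_id, time_stamp);

-- Check what MySQL plans to do for a single participant (1 is a placeholder id).
explain
select min(time_stamp)
from tasks
where participant_id = 1;
```

If the plan shows the new index being used, the per-row subquery should get much cheaper, although building the index on a large table takes time of its own.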
You can do it with a temporary table like this:
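-- Compute the earliest task time per participant once.
create temporary table temp
select participants.participant_id, min(tasks.time_stamp) as start_time
from participants inner join tasks on participants.participant_id = tasks.participant_id
group by participants.participant_id;
-- Copy the precomputed values back; columns are qualified to avoid ambiguity.
update participants, temp
set participants.start_time = temp.start_time
where participants.participant_id = temp.participant_id;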
This replaces the correlated subquery with a much faster join.
Temporary tables are dropped automatically by the MySQL server when the MySQL connection to the client is closed, so depending on your application's connection handling you might want to drop it manually.
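If the connection is pooled or long-lived, the manual cleanup could look like this (assuming the temporary table is named temp, as above):

```sql
-- "if exists" keeps the statement from failing when the table is already gone.
drop temporary table if exists temp;
```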
I don't think you need an inner select:
update participants set start_time = min(time_stamp)
Correction:
update participants
set start_time = min(tasks.time_stamp)
from participants inner join
tasks on participants.participant_id = tasks.participant_id
With the correct foreign key and index settings, it shouldn't take so long.
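One caveat: MySQL does not accept the update ... from form, and it rejects an aggregate placed directly in the set clause. A sketch of the same join idea in MySQL's own multi-table update syntax, using a derived table for the grouping (the alias t and the column alias first_time are made up for illustration):

```sql
-- Join participants to the per-participant minimum, computed once in a derived table.
update participants
inner join (
    select participant_id, min(time_stamp) as first_time
    from tasks
    group by participant_id
) as t on t.participant_id = participants.participant_id
set participants.start_time = t.first_time;
```

Unlike the correlated subquery in the question, this leaves start_time untouched for participants that have no tasks, rather than setting it to null.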