Efficient SQL Query Design_问答_开发者_运维开发者技术经验分享

开发者 https://www.devze.com 2023-03-30 03:14 出处：网络

What is generally considered the most efficient way to do this type of query? We have a database of 10 years worth of laboratory data and we would like to select out performance data for various tes

What is generally considered the most efficient way to do this type of query?

We have a database of 10 years worth of laboratory data and we would like to select out performance data for various tests. This query for example will select the number of hours its taken to do a test and calculate an average turnaround time and allow us to plot a sparkline of avg TAT per day.

Say we have 100 test names is it acceptable in terms of performance to iterate over the test names in a loop and fire this query off once per loop? Or is there a more efficient way?

SELECT 
  Date_Authorised_Index.Date_Authorised
  , Result_Set.Date_Booked_In
  , avg(DATEDIFF('hh',Result_Set.Date_Time_Booked_In,Result_Set.Date_Time_Authorised)) as HrsIn
  , count(Date_Authorised_Index.Date_Authorised) as numbers
  , Date_Authorised_Index.Registration_Number
  , Date_Authorised_Index.Request_Row_ID
  , Date_Authorised_Index.Specimen_Number
  , Result_Set.Authorised_By
  , Result_Set.Namespace
  , Result_Set.Set_Code
  , Result_Set.Date_Time_Authori开发者_运维百科sed
  , Request.Date_Time_Received
  , Request.Location 
FROM 
  Date_Authorised_Index Date_Authorised_Index
  , Result_Set Result_Set
  , Request 
WHERE 
  Date_Authorised_Index.Date_Authorised = Result_Set.Date_Authorised 
  AND Date_Authorised_Index.Request_Row_ID = Request.Request_Row_ID
  AND Date_Authorised_Index.Request_Row_ID = Result_Set.Request_Row_ID 
  AND (Date_Authorised_Index.Discipline='C') AND Result_Set.Set_Code=? 
GROUP BY 
  Result_Set.Date_Booked_In

For starters I would rewrite this query so it uses explicit join syntax.
Also even though MySQL does not force you to restate every non-aggregate column in the group by clause that doesn't mean that's a good thing.
Unless the Result_Set.Date_Booked_In uniquely identifies a row, you are selecting random values from a multiple of rows.

SELECT 
  dai.Date_Authorised
  , rs.Date_Booked_In
  , avg(DATEDIFF('hh',rs.Date_Time_Booked_In,rs.Date_Time_Authorised)) as HrsIn
  , count(dai.Date_Authorised) as numbers
  , dai.Registration_Number
  , dai.Request_Row_ID
  , dai.Specimen_Number
  , rs.Authorised_By
  , rs.Namespace
  , rs.Set_Code
  , rs.Date_Time_Authorised
  , r.Date_Time_Received
  , r.Location 
FROM 
  Date_Authorised_Index dai    
INNER JOIN Result_Set rs ON (dai.Date_Authorised = rs.Date_Authorised
                         AND dai.Request_Row_ID = rs.Request_Row_ID)
INNER JOIN Request R ON (dai.Request_Row_ID = r.Request_Row_ID)
WHERE 
  (dai.Discipline= 'C') AND rs.Set_Code=? 
GROUP BY 
  rs.Date_Booked_In

If you want to select a 100 rows in one go, just make a new table with the set_codes you want to select and join against that.
Make sure you index the field sc.set_code (or better yet make it the primary key)

SELECT lots_of_columns
FROM table1 as dai
INNER JOIN table2 as rs ON (what you joined on before)
INNER JOIN table3 as r ON (same here)
INNER JOIN Setcodes as sc ON (sc.Set_code = rs.SetCode)  <<-- extra join.
WHERE 
  dai.discipline = 'C'
GROUP BY  rs.Date_Booked_In

Or you can even use a `IN (...) like below, although that will propably be slower than a join.

SELECT lots_of_columns
FROM table1 as dai
INNER JOIN table2 as rs ON (what you joined on before)
INNER JOIN table3 as r ON (same here)
WHERE 
  dai.discipline = 'C' AND rs.Set_Code IN (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)
GROUP BY  rs.Date_Booked_In