开发者

MySQL Left Join, Group By, Order By, Limit = Terrible Performance

开发者 https://www.devze.com 2023-01-20 09:37 出处:网络
I am currently developing a an application to allow users to search through a database of documents using various paramaters and returning a set of paged results. I am building it in PHP/MySQL, which

I am currently developing a an application to allow users to search through a database of documents using various paramaters and returning a set of paged results. I am building it in PHP/MySQL, which is not my usual development platform, but its been grand so far.

The problem I am having is that in order to return a full set of results I have to use LEFT JOIN on every table, which completely destroys my performance. The person who developed the database has said that the query I am using will return the correct results, so thats what I have to use. The query is below, I am by no means an SQL Guru and could use some help on this.

I have been thinking that it might be better to split the query into sub-queries? Below is my current query:

    SELECT d.title, d.deposition_id, d.folio_start, d.folio_end, pl.place_id, p.surname, p.forename, p.person_type_id, pt.person_type_desc, p.age, d.manuscript_number, dt.day, dt.month, dt.year, plc.county_id, c.county_desc
 FROM deposition d 
 LEFT JOIN person AS p ON p.deposition_id = d.deposition_id 
 LEFT JOIN person_type AS pt ON p.person_type_id = pt.person_type_id 
 LEFT JOIN place_link AS pl ON pl.deposition_id = d.deposition_id 
 LEFT JOIN date AS dt ON dt.deposition_id = d.deposition_id 
 LEFT JOIN place AS plc ON pl.place_id = plc.place_id 
 LEFT JOIN county AS c ON plc.county_id = c.county_id
 WHERE 1 AND 开发者_如何学Pythond.manuscript_number = '840' 
 GROUP BY d.deposition_id ORDER BY d.folio_start ASC
 LIMIT 0, 20

Any help or guidance would be greatly appreciated!

Deposition Table:

CREATE TABLE IF NOT EXISTS `deposition` (
  `deposition_id` varchar(11) NOT NULL default '',
  `manuscript_number` int(10) NOT NULL default '0',
  `folio_start` varchar(4) NOT NULL default '0',
  `folio_end` varchar(4) default '0',
  `page` int(4) default NULL,
  `deposition_type_id` int(10) NOT NULL default '0',
  `comments` varchar(255) default '',
  `title` varchar(255) default NULL,
  PRIMARY KEY  (`deposition_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

Date Table

CREATE TABLE IF NOT EXISTS `date` (
  `deposition_id` varchar(11) NOT NULL default '',
  `day` int(2) default NULL,
  `month` int(2) default NULL,
  `year` int(4) default NULL,
  PRIMARY KEY  (`deposition_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

Person_Type

CREATE TABLE IF NOT EXISTS `person_type` (
  `person_type_id` int(10) NOT NULL auto_increment,
  `person_type_desc` varchar(255) NOT NULL default '',
  PRIMARY KEY  (`person_type_id`)
) ENGINE=InnoDB  DEFAULT CHARSET=latin1 AUTO_INCREMENT=59 ;


Seems that you want to select one person, place etc. per deposition.

The query you wrote will return you this, but it's not guaranteed which one will it return, and the query is inefficient.

Try this:

SELECT  d.title, d.deposition_id, d.folio_start, d.folio_end, pl.place_id, p.surname, p.forename, p.person_type_id, pt.person_type_desc, p.age, d.manuscript_number, dt.day, dt.month, dt.year, plc.county_id, c.county_desc
FROM    deposition d
LEFT JOIN
        person p
ON      p.id = 
        (
        SELECT  id
        FROM    person pi
        WHERE   pi.deposition_id = d.deposition_id
        ORDER BY
                pi.deposition_id, pi.id
        LIMIT 1
        )
LEFT JOIN
        place_link AS pl
ON      pl.id = 
        (
        SELECT  id
        FROM    place_link AS pli
        WHERE   pli.deposition_id = d.deposition_id
        ORDER BY
                pli.deposition_id, pi.id
        LIMIT 1
        )
LEFT JOIN
        date AS dt
ON      dt.id = 
        (
        SELECT  id
        FROM    date AS dti
        WHERE   dti.deposition_id = d.deposition_id
        ORDER BY
                dti.deposition_id, pi.id
        LIMIT 1
        )
LEFT JOIN
        place AS plc
ON      plc.place_id = pl.place_id 
LEFT JOIN
        county AS c
ON      c.county_id = plc.county_id
WHERE   d.manuscript_number = '840' 
ORDER BY
        d.manuscript_number, d.folio_start
LIMIT   20

Create an index on deposition (manuscript_number, folio_start) for this to work fast

Also create a composite index on (deposition_id, id) on person, place_link and date.


The poor performance is almost certainly from lack of indexes. Your deposition table doesn't have any indexes, and that probably means the other tables you're referencing don't have any either. You can start by adding an index to your deposition table. From the MySQL shell, or phpMyAdmin, issue the following query.

ALTER TABLE deposition ADD INDEX(deposition_id, manuscript_number);

You know you're on the right track if the query executes faster after adding the index. From there you might want to put indexes on the other tables on the referenced columns. For instance for this part of your query "LEFT JOIN person AS p ON p.deposition_id = d.deposition_id", you could try adding an index to the person table using.

ALTER TABLE person ADD INDEX(deposition_id);


You only need a LEFT JOIN if the joined table might not have a matching value. Is it possible in your database schema for a person to not have a matching person_type? Or deposition to not have a matching row in date? A place not have a matching county?

For any of those relationships that must exist for the result to make sense you can change the LEFT JOIN to an INNER JOIN.

These columns should have indexes (unique if possible):

person.deposition_id
date.deposition_id
place_link.deposition_id
place_link.place_id

The date table looks like a bad design; I can't think of a reason to have a table of dates instead of just putting a column of type date (or datetime) in the deposition table. And date is a terrible name for a table because it's a SQL reserved word.

0

精彩评论

暂无评论...
验证码 换一张
取 消