How to speed up a SQL query in MySQL?

Developer https://www.devze.com 2022-12-24 16:14 Source: Internet
in my mysql database i've got the geonames database, containing all countries, states and cities.

i am using this to create a cascading menu so the user could select where he is from: country -> state -> county -> city.

but the main problem is that the query searches through all 7 million rows in that table each time i want to get the list of child rows, and that takes a while: 10-15 seconds.

i wonder how i could speed this up: caching? table views? reorganizing table structure somehow?

and most important, how do i do these things? are there good tutorials you could link to me?

i appreciate all help and feedback discussing smart ways of handling this issue!

UPDATE: here is my table structure:

CREATE TABLE `geonames_copy` (
  `geoname_id` mediumint(9) NOT NULL,
  `parent_id` mediumint(9) DEFAULT NULL,
  `name` varchar(200) DEFAULT NULL,
  `ascii_name` varchar(200) DEFAULT NULL,
  `alternate_names` varchar(4000) DEFAULT NULL,
  `latitude` decimal(10,7) DEFAULT NULL,
  `longitude` decimal(10,7) DEFAULT NULL,
  `feature_class` char(1) DEFAULT NULL,
  `feature_code` varchar(10) DEFAULT NULL,
  `country_code` varchar(2) DEFAULT NULL,
  `cc2` varchar(60) DEFAULT NULL,
  `admin1_code` varchar(20) DEFAULT NULL,
  `admin2_code` varchar(80) DEFAULT NULL,
  `admin3_code` varchar(20) DEFAULT NULL,
  `admin4_code` varchar(20) DEFAULT NULL,
  `population` bigint(20) DEFAULT NULL,
  `elevation` int(11) DEFAULT NULL,
  `gtopo30` smallint(6) DEFAULT NULL,
  `time_zone` varchar(40) DEFAULT NULL,
  `modification_date` date DEFAULT NULL,
  PRIMARY KEY (`geoname_id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

and here is the sql query:

$query = "SELECT geoname_id, name
          FROM geonames
          WHERE parent_id = '$geoname_id'
          AND feature_class = 'A'";

should i just create an index on the 2 columns parent_id and feature_class?

one question: isn't it better to create an index with solr instead of using mysql? one benefit is that i'm already using solr, and another is that it supports full-text search. so maybe it's better not to use both solr and mysql (2 things to be good at)?


As mentioned, more info would be helpful (SQL, database structure).

The AJAX suggestion is a good one, though you could also do this without ajax.

Do NOT execute a select at any point that selects all of the data. This will be extremely slow.

First, populate only the list of countries. Allow the user to make a selection from this list. After the user selects a country, either via AJAX or by refreshing the entire page, populate the list of states for that country only, with something like (select state from geonames where country = @country). When the user selects a state, populate the list of counties for that country and state, with something like (select county from geonames where country = @country and state = @state). Continue in this manner for the city.
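
The level-by-level lookup described above could be sketched roughly as follows. This is only an illustration under several assumptions: Python's built-in sqlite3 module stands in for MySQL, the table is a cut-down `geonames` with the `geoname_id`, `parent_id`, `name` and `feature_class` columns from the question, the rows are made up, countries are assumed to have a NULL `parent_id`, and `children()` is a hypothetical helper. The `?` placeholder binds the selected id instead of pasting it into the SQL string.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE geonames (
           geoname_id INTEGER PRIMARY KEY,
           parent_id INTEGER,
           name TEXT,
           feature_class TEXT
       )"""
)
# Made-up sample rows, only to have something to select from.
conn.executemany(
    "INSERT INTO geonames VALUES (?, ?, ?, ?)",
    [
        (1, None, "Country A", "A"),
        (2, None, "Country B", "A"),
        (10, 1, "State A1", "A"),
        (11, 1, "State A2", "A"),
        (12, 2, "State B1", "A"),
        (100, 10, "County A1a", "A"),
        (101, 10, "Lake in A1", "H"),
    ],
)


def children(conn, parent_id):
    """Return (geoname_id, name) for the direct children of one row only."""
    if parent_id is None:
        # Top level: rows without a parent, i.e. the countries in this sample.
        sql = ("SELECT geoname_id, name FROM geonames "
               "WHERE parent_id IS NULL AND feature_class = 'A' ORDER BY name")
        return conn.execute(sql).fetchall()
    sql = ("SELECT geoname_id, name FROM geonames "
           "WHERE parent_id = ? AND feature_class = 'A' ORDER BY name")
    return conn.execute(sql, (parent_id,)).fetchall()


countries = children(conn, None)   # first menu
states = children(conn, 1)         # loaded only after country 1 is picked
counties = children(conn, 10)      # loaded only after state 10 is picked
```

Each call touches one level of the hierarchy, so no request ever selects the whole table.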

I'm not very familiar with MySQL, but in SQL Server I would create an index on (Country, State, County, City) to speed up this set of queries. I'm not sure whether MySQL would be able to accelerate the entire set of queries with this index.

Of course, I'm making some assumptions about how your data is structured here, so this info may or may not be relevant.


Post your SQL for a better reply, but in general:

  • Make indexes on fields that you do joins/wheres on.
  • Do not use "SELECT *" -- only select the fields you need.
  • Hydrate as arrays instead of objects.

Also, if the menu never changes, cache the HTML in a file. You could even only cache the country/state HTML, then fetch cities via AJAX if they change often.
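
As a rough sketch of the caching idea, the rendered options for each menu level can be kept after the first request. Note the swap: this example caches in memory with Python's `functools.lru_cache` rather than in a file as suggested above. `fetch_children()`, `SAMPLE` and `options_html()` are hypothetical names, and the lookup is a stub standing in for the slow database query.

```python
import functools
import html

# Hypothetical stand-in for the slow database lookup; counts how often it runs.
CALLS = {"count": 0}
SAMPLE = {0: [(1, "Country A"), (2, "Country B")], 1: [(10, "State A1")]}


def fetch_children(parent_id):
    CALLS["count"] += 1
    return SAMPLE.get(parent_id, [])


@functools.lru_cache(maxsize=1024)
def options_html(parent_id):
    """Render the <option> tags for one menu level, cached per parent_id."""
    rows = fetch_children(parent_id)
    return "".join(
        '<option value="{}">{}</option>'.format(gid, html.escape(name))
        for gid, name in rows
    )


first = options_html(0)
second = options_html(0)  # served from the cache, no second lookup
```

If the underlying data changes, `options_html.cache_clear()` discards the cached entries.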


I believe stuff like this is usually done with AJAX. At the beginning, you only load the country names, and after one is selected, you dynamically load the state names in that country, then repeat for each subdivision after that.


This is a good scenario for partitioning a table and even having sub-partitions. You could partition the table by country, and then sub-partition by state. This will significantly reduce the amount of data your query has to search through, as huge segments of data can be removed from the execution plan.

Here is a good place to start for information on MySQL partitioning.

Along with the partitioning (and even if you choose not to partition), you'll want to create indexes on the columns you're searching on, as this will further improve the performance of the queries.

Here is the MySQL documentation on HOW to create indexes, but the tough part about indexing is knowing what to index. Typically you'll target the columns that show up in the WHERE clauses of your queries or the columns you JOIN on. This is pretty general, and you don't (and in many cases shouldn't) have to index every column in your WHERE clauses, but it is a good place to start. Based on the limited data given in the question, you will most likely want a composite index on country and region in order to speed up the selection of the cities. Use EXPLAIN to determine when an index is necessary and whether the query is actually using it. Do a search on SO for "MySQL indexing" and you'll find more than enough information on the when, where, and how of indexing tables.
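
To illustrate checking whether a composite index is actually used, here is a small sketch. It runs against SQLite through Python's sqlite3 module as a stand-in, so it uses SQLite's `EXPLAIN QUERY PLAN`; in MySQL you would run `EXPLAIN SELECT ...` and look at the `key` column instead. The index name `idx_country_region` and the helper `plan()` are made up for the example, and the column names are borrowed from the table in the question. The point it tries to show is that column order matters: an index on (country, region) helps queries that filter on country, or on country and region, but generally not on region alone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE geonames (geoname_id INTEGER PRIMARY KEY, name TEXT, "
    "country_code TEXT, admin1_code TEXT)"
)
# Composite index: country first, then region.
conn.execute(
    "CREATE INDEX idx_country_region ON geonames (country_code, admin1_code)"
)


def plan(sql, params):
    """Return the planner's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row is the human-readable detail text.
    return " | ".join(row[-1] for row in rows)


both = plan(
    "SELECT name FROM geonames WHERE country_code = ? AND admin1_code = ?",
    ("US", "FL"),
)
country_only = plan("SELECT name FROM geonames WHERE country_code = ?", ("US",))
region_only = plan("SELECT name FROM geonames WHERE admin1_code = ?", ("FL",))

for label, text in [("both", both), ("country", country_only), ("region", region_only)]:
    print(label, "->", text)
```

The exact wording of the plan text differs between SQLite versions, so treat the printed output as a hint about index usage rather than a stable format.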

If you haven't already, it will help to normalize your data. For example, if your table currently looks something like:

usa;fl;miami;....
usa;fl;orlando;....

It should be changed to something like:

COUNTRY Table:
--------------
COUNTRY_KEY            1
THREE_LETTER           'usa'
COUNTRY_NAME           'united states'
..OTHER COLUMNS....

REGION Table:
--------------
COUNTRY_KEY            1
REGION_KEY             10
REGION_CODE            'fl'
REGION_NAME            'florida'
..OTHER COLUMNS....

CITY Table:
--------------
REGION_KEY             10
CITY_KEY               20
CITY_NAME              'miami'
LAT                    123.12
LONG                   123.12
..OTHER COLUMNS....

From the standpoint of the UI, you'll want to write it in a manner where you're only populating the data necessary and then generating the other data entry points with the matching criteria. So on initial load, you'll populate the country input with a:

SELECT country_key, three_letter 
FROM COUNTRY 
ORDER BY three_letter;

When the user selects the country they are interested in, then you select out all the regions with that country key.

SELECT region_key, region_code
FROM REGION
WHERE country_key = :input_country_key
ORDER BY region_code;

So on and so forth until you retrieve the user's data.
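
Put together, the normalized layout and the per-level queries might look like the sketch below. It is an assumption-laden illustration: SQLite (via Python's sqlite3) stands in for MySQL, the column types and the two index names are guesses, only a few of the columns are included, and the rows reuse the sample values from this answer (the key 21 for orlando is made up).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE country (
        country_key INTEGER PRIMARY KEY,
        three_letter TEXT NOT NULL,
        country_name TEXT NOT NULL
    );
    CREATE TABLE region (
        region_key INTEGER PRIMARY KEY,
        country_key INTEGER NOT NULL REFERENCES country (country_key),
        region_code TEXT NOT NULL,
        region_name TEXT NOT NULL
    );
    CREATE TABLE city (
        city_key INTEGER PRIMARY KEY,
        region_key INTEGER NOT NULL REFERENCES region (region_key),
        city_name TEXT NOT NULL
    );
    -- Index the foreign keys that the menu queries filter on.
    CREATE INDEX idx_region_country ON region (country_key);
    CREATE INDEX idx_city_region ON city (region_key);

    INSERT INTO country VALUES (1, 'usa', 'united states');
    INSERT INTO region VALUES (10, 1, 'fl', 'florida');
    INSERT INTO city VALUES (20, 10, 'miami'), (21, 10, 'orlando');
    """
)

# Initial load: countries only.
countries = conn.execute(
    "SELECT country_key, three_letter FROM country ORDER BY three_letter"
).fetchall()
# After a country is chosen: only that country's regions.
regions = conn.execute(
    "SELECT region_key, region_code FROM region "
    "WHERE country_key = :input_country_key ORDER BY region_code",
    {"input_country_key": 1},
).fetchall()
# After a region is chosen: only that region's cities.
cities = conn.execute(
    "SELECT city_key, city_name FROM city "
    "WHERE region_key = :input_region_key ORDER BY city_name",
    {"input_region_key": 10},
).fetchall()
```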

Hope this helps.


ALTER TABLE geonames_copy ADD INDEX (parent_id, feature_class);

should do the trick. An index on just parent_id will probably work fine as well.
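
A quick way to see the effect is to compare the query plan before and after adding the index. The sketch below does that on a cut-down copy of the table, with SQLite (Python's sqlite3 module) standing in for MySQL; the index name `idx_parent_class` and the helper `plan()` are made up for the example. In MySQL the equivalent check would be `EXPLAIN SELECT ...`, looking at the `key` column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE geonames_copy (geoname_id INTEGER PRIMARY KEY, "
    "parent_id INTEGER, name TEXT, feature_class TEXT)"
)

# The query from the question, with the id bound as a parameter.
QUERY = (
    "SELECT geoname_id, name FROM geonames_copy "
    "WHERE parent_id = ? AND feature_class = 'A'"
)


def plan():
    """Return SQLite's plan text for QUERY (last column of each plan row)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY, (1,)).fetchall()
    return " | ".join(row[-1] for row in rows)


before = plan()  # no secondary index yet
conn.execute(
    "CREATE INDEX idx_parent_class ON geonames_copy (parent_id, feature_class)"
)
after = plan()   # same query, after adding the composite index

print("before:", before)
print("after: ", after)
```

Without the index the planner has to scan the table; with it, the lookup goes through the index on the two filtered columns.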
