The website I have to manage is a search engine for workers (yellow-pages style).
I have created a database like this:
People: <---- 4,000,000 records
    id
    name
    address
    id_activity <--- linked to the activities table
    tel
    fax
    id_region <--- linked to the regions table
activities: <---- 1,500 activities
    id
    name_activity
regions: <--- 95 regions
    id
    region_name
locations: <---- 4,000,000 records
    id_people
    lat
    lon
So basically the query I am having trouble with selects all the "workers" around a city chosen by the user.
The query I have created works, but it takes 5-6 seconds to return results.
Basically I do a select on the locations table to find all the points within a certain radius, then join to the people table:
SELECT people.*, locations.lat, locations.lon,
    (6371 * ACOS(COS(RADIANS(plat)) * COS(RADIANS(lat))
         * COS(RADIANS(lon) - RADIANS(plon))
         + SIN(RADIANS(plat)) * SIN(RADIANS(lat)))) AS distance
FROM locations
JOIN people ON locations.id_people = people.id
HAVING distance < dist
ORDER BY distance
LIMIT 0, 20;
(plat, plon, and dist are the user's latitude, longitude, and search radius.)
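For reference, the distance expression in the query is the spherical law of cosines (6371 being the Earth's mean radius in km). A minimal Python translation makes it easy to sanity-check results against known city pairs (the coordinates below are just illustrative):

```python
import math

def distance_km(plat, plon, lat, lon):
    """Great-circle distance via the spherical law of cosines,
    mirroring the SQL expression above (Earth radius = 6371 km)."""
    return 6371 * math.acos(
        math.cos(math.radians(plat)) * math.cos(math.radians(lat))
        * math.cos(math.radians(lon) - math.radians(plon))
        + math.sin(math.radians(plat)) * math.sin(math.radians(lat))
    )

# Paris (48.8566, 2.3522) to Lyon (45.7640, 4.8357): roughly 390 km
print(round(distance_km(48.8566, 2.3522, 45.7640, 4.8357)))
```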
My questions are:
- Is my database well designed? I don't know if it's a good idea to have 2 tables with 4,000,000 records each. Is it OK to do a select on them?
- Is my query badly designed?
- How can I speed up the search?
The design looks normalized; this is what I would expect to see in most well-designed databases. The amount of data in the tables is important, but secondary. However, if there is a 1-to-1 correlation between People and locations, as appears from your query, I would say the tables should be one table. This will certainly help.
Your SQL looks OK, though adding constraints to reduce the number of rows involved would help.
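One concrete form such a constraint can take is a bounding-box pre-filter: a cheap BETWEEN on lat/lon (which an index can serve) that throws away most rows before any trigonometry runs. A sketch using SQLite and a hypothetical merged people table with made-up sample rows (111 km per degree of latitude is an approximation):

```python
import math
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE people (          -- merged people + locations (1-to-1)
    id INTEGER PRIMARY KEY,
    name TEXT,
    address TEXT,
    id_activity INTEGER,
    tel TEXT,
    fax TEXT,
    id_region INTEGER,
    lat REAL,
    lon REAL
);
""")
conn.executemany("INSERT INTO people (id, name, lat, lon) VALUES (?,?,?,?)",
                 [(1, "near", 48.86, 2.36), (2, "far", 45.76, 4.84)])

plat, plon, dist = 48.8566, 2.3522, 50.0   # search centre and radius (km)
# One degree of latitude is ~111 km; longitude shrinks by cos(latitude).
dlat = dist / 111.0
dlon = dist / (111.0 * math.cos(math.radians(plat)))
rows = conn.execute("""
    SELECT id, name FROM people
    WHERE lat BETWEEN ? AND ? AND lon BETWEEN ? AND ?
""", (plat - dlat, plat + dlat, plon - dlon, plon + dlon)).fetchall()
print(rows)   # only the nearby worker survives the pre-filter
```

In the real query you would keep the distance expression after this WHERE clause, so the exact circle test only runs on the few rows inside the box.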
You need to index your tables. This is what will normally help most with slowness (many developers don't consider database indexes at all).
There are a couple of basic things that could be making your query run slowly.
What do your indexes look like? Have you declared primary keys on the tables? Joining two tables of 4M rows each without indexes causes a lot of work for the DB. Make sure you get this right first.
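As a sketch of that indexing advice (using SQLite here; MySQL syntax is nearly identical): index the join column and the filter columns, then check that the query plan actually uses them. The table and index names are made up for the example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE people   (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE locations(id_people INTEGER, lat REAL, lon REAL);
-- The join column and the filtered columns both need indexes.
CREATE INDEX idx_loc_people ON locations(id_people);
CREATE INDEX idx_loc_lat    ON locations(lat);
""")
# EXPLAIN QUERY PLAN (MySQL: plain EXPLAIN) shows whether the indexes
# are used instead of a full table scan.
plan = conn.execute("""
    EXPLAIN QUERY PLAN
    SELECT p.name
    FROM locations l JOIN people p ON l.id_people = p.id
    WHERE l.lat BETWEEN 48.4 AND 49.3
""").fetchall()
for row in plan:
    print(row[-1])
```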
If you've already built the right indexes, you can look at caching data. You're doing a calculation in your query. Are the locations (lat/lon) generally fixed? How often do they change? Are the items in your locations table actual places (cities, buildings, etc.), or are they records of where the people have been (like Foursquare check-ins)?
If your locations are places you can make a lot of nice optimizations if you isolate the parts of your data that change infrequently and pre-calculate the distances between them.
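For instance, if the places themselves are fixed, the pairwise distances can be computed once in a batch job and each request becomes a plain lookup with no trigonometry. A sketch with a hypothetical set of cities (names and coordinates are illustrative, not from the original data):

```python
import itertools
import math

def law_of_cosines_km(lat1, lon1, lat2, lon2):
    """Same great-circle formula as the SQL query (Earth radius 6371 km)."""
    return 6371 * math.acos(
        math.cos(math.radians(lat1)) * math.cos(math.radians(lat2))
        * math.cos(math.radians(lon2) - math.radians(lon1))
        + math.sin(math.radians(lat1)) * math.sin(math.radians(lat2))
    )

# Hypothetical fixed places; in the real schema these would come from
# the locations table.
cities = {
    "Paris": (48.8566, 2.3522),
    "Lyon":  (45.7640, 4.8357),
    "Lille": (50.6292, 3.0573),
}

# Precompute every pairwise distance once (e.g. in a nightly job),
# so serving a radius query never recomputes any trigonometry.
pair_km = {}
for (a, pa), (b, pb) in itertools.combinations(cities.items(), 2):
    d = law_of_cosines_km(pa[0], pa[1], pb[0], pb[1])
    pair_km[(a, b)] = pair_km[(b, a)] = d

def cities_within(origin, radius_km):
    """Answer a radius query from the precomputed table."""
    return sorted(c for c in cities
                  if c != origin and pair_km[(origin, c)] <= radius_km)

print(cities_within("Paris", 250))
```

In the database this would be a distances table keyed on (place_a, place_b), refreshed only when a place is added or moved.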
If all else fails, make sure your database server has enough RAM. If the server can keep your data in memory it will speed things up a lot.