MySQL SELECT with BETWEEN on two columns runs too slowly

Developer https://www.devze.com 2023-02-28 01:46 Source: web
I have this query:

SELECT `country`
FROM `geoip_base`
WHERE 1840344811 BETWEEN `start` AND `stop`

It uses the index poorly (it does use one, but still scans a large part of the table) and runs too slowly. I tried adding ORDER BY and LIMIT, but that didn't help.

"start <= 1840344811 AND 1840344811 <= stop" performs similarly.

CREATE TABLE IF NOT EXISTS `geoip_base` (
  `start` decimal(10,0) NOT NULL,
  `stop` decimal(10,0) NOT NULL,
  `inetnum` char(33) collate utf8_bin NOT NULL,
  `country` char(2) collate utf8_bin NOT NULL,
  `city_id` int(11) NOT NULL,
  PRIMARY KEY  (`start`,`stop`),
  UNIQUE KEY `start` (`start`),
  UNIQUE KEY `stop` (`stop`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_bin;

The table has 57,424 rows.

EXPLAIN for the query "... BETWEEN start AND stop ORDER BY start LIMIT 1" shows it using the key stop and examining 24,099 rows. Without ORDER BY and LIMIT, MySQL doesn't use any key and reads all rows.


If your table is MyISAM, you can improve this query using SPATIAL indexes:

ALTER TABLE
        geoip_base
ADD     ip_range LineString;

UPDATE  geoip_base
SET     ip_range =
        LineString
                (
                Point(-1, `start`),
                Point(1, `stop`)
                );

ALTER TABLE
        geoip_base
MODIFY  ip_range LineString NOT NULL;

CREATE SPATIAL INDEX
        sx_geoip_range ON geoip_base (ip_range);

SELECT  country
FROM    geoip_base
WHERE   MBRContains(ip_range, Point(0, 1840344811));

This article may be of interest to you:

  • Banning IP's

Alternatively, if your ranges do not intersect (and from the nature of the database I expect they don't), you can create a UNIQUE index on geoip_base.start and use this query:

SELECT  *
FROM    geoip_base
WHERE   1840344811 BETWEEN `start` AND `stop`
ORDER BY
        `start` DESC
LIMIT 1;

Note the ORDER BY and LIMIT clauses; they are important.

This query is similar to this:

SELECT  *
FROM    geoip_base
WHERE   `start` <= 1840344811
        AND `stop` >= 1840344811
ORDER BY
        `start` DESC
LIMIT 1;

Using ORDER BY / LIMIT makes the query choose a descending index scan on start, which stops at the first match (i.e. at the range whose start is closest to the IP you enter). The additional filter on stop just checks whether that range contains the IP.

Since your ranges do not intersect, either this range or no range at all will contain the IP you're after.
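
The logic of that descending scan can be sketched outside the database. This is only a minimal Python illustration, assuming non-overlapping ranges kept sorted by start; the lookup function and the sample rows are hypothetical, and bisect stands in for the index seek that MySQL would perform.

```python
from bisect import bisect_right

# Hypothetical sample rows: (start, stop, country), non-overlapping, sorted by start.
RANGES = [
    (1000, 1999, "AA"),
    (3000, 3999, "BB"),
    (5000, 5999, "CC"),
]
STARTS = [r[0] for r in RANGES]


def lookup(ip):
    """Return the country whose range contains ip, or None."""
    # Counterpart of "WHERE start <= ip ORDER BY start DESC LIMIT 1":
    # find the last row whose start is not greater than ip.
    i = bisect_right(STARTS, ip) - 1
    if i < 0:
        return None  # ip lies before the first range
    start, stop, country = RANGES[i]
    # Counterpart of the extra "stop >= ip" filter on that single candidate.
    return country if ip <= stop else None


print(lookup(3500))  # BB
print(lookup(2500))  # None (falls in the gap between two ranges)
```

The sketch returns after checking a single candidate, which is valid only under the non-overlap assumption stated above.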


While Quassnoi's answer (https://stackoverflow.com/a/5744860/1095353) is perfectly fine, the MySQL 5.7 function MBRContains(g1,g2) does not cover the full range of IPs when used in the SELECT: it matches [g1,g2[, not including g2.

Using MBRTouches(g1,g2) allows both endpoints, [g1,g2], to be matched. With IP blocks stored in the database as start and stop columns, this function is the more viable choice.

On a database table with ~6m rows (AWS db.m4.xlarge)

SELECT *, AsWKT(`ip_range`) AS `ip_range`
FROM `geoip_base` where `start` <= 1046519788 AND `stop` >= 1046519788;

~ 2-5 seconds

SELECT *, AsWKT(`ip_range`) AS `ip_range`
FROM `geoip_base` where MBRTouches(`ip_range`, Point(0,  INET_ATON('XX.XX.XX.XX')));

~ < 0.030 seconds

Source: MBRTouches(g1,g2) - https://dev.mysql.com/doc/refman/5.7/en/spatial-relation-functions-mbr.html#function_mbrtouches


Your table design is off.

You're using DECIMAL with zero decimal places. You immediately spend 5 bytes storing such a number, when a simple INT (4 bytes; UNSIGNED to cover all IPv4 values) would suffice.
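
To illustrate why 4 bytes are enough, here is a small Python sketch. The inet_aton helper is a hypothetical stand-in that mirrors what MySQL's INET_ATON computes for a dotted-quad address; the dotted address is simply the number from the question converted back.

```python
import ipaddress
import struct


def inet_aton(dotted):
    """Convert a dotted-quad IPv4 string to its integer value."""
    return int(ipaddress.IPv4Address(dotted))


ip = inet_aton("109.177.110.235")  # the value used in the question
packed = struct.pack("!I", ip)     # unsigned 32-bit, network byte order

print(ip)           # 1840344811
print(len(packed))  # 4
print(inet_aton("255.255.255.255"))  # 4294967295, the largest IPv4 value
```

The largest value exceeds the signed 32-bit maximum (2147483647), which is why an unsigned column is the safer assumption here.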

After that, you create a compound primary key (5 + 5 bytes) followed by two unique constraints (again 5 bytes each), effectively making your index file almost the same size as the data file. As a result, the index is extremely inefficient no matter what you do.

Using LIMIT doesn't force MySQL to use indexes, at least not the way you constructed your query. MySQL will first obtain the dataset satisfying the condition and then discard the rows that fall outside the OFFSET/LIMIT window.

Also, using MySQL keywords (such as START and STOP) as column names is a bad idea; you should avoid naming columns after keywords.

It would help to keep your primary key as it is and not index the columns separately. Configuring MySQL to use more memory would also speed up execution.

For testing purposes I created a table similar to yours, defined a compound key on start and stop, and used the following query:

SELECT `country` FROM table WHERE 1500 BETWEEN `start` AND `stop` AND start >= 1500

My table is InnoDB and holds 100k rows. The query examines 87 rows this way and executes in a few milliseconds; my buffer pool size is 90% of the memory on my test machine. That might give some insight into optimizing your query or database instance.


SELECT id FROM GEODATA WHERE start_ip <=(select INET_ATON('113.0.1.63')) AND end_ip >=(select INET_ATON('113.0.1.63')) ORDER BY start_ip DESC LIMIT 1;


The above example from Michael J.V. will not work: SELECT country FROM table WHERE 1500 BETWEEN start AND stop AND start >= 1500

1500 BETWEEN start AND stop is the same as start <= 1500 AND stop >= 1500.

Thus you have start <= 1500 AND start >= 1500 in the same clause. So the only way it can match is if start = 1500, which is why the optimizer knows to use the start index.
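
The effect of the extra condition can be checked with a few made-up rows. This is a minimal Python sketch, not MySQL itself; the rows and function names are hypothetical, and each list comprehension restates the corresponding WHERE clause.

```python
# Hypothetical rows: (start, stop), non-overlapping.
ROWS = [(1000, 1499), (1500, 1599), (1600, 1700)]


def plain_between(ip):
    # WHERE ip BETWEEN start AND stop
    return [r for r in ROWS if r[0] <= ip <= r[1]]


def with_extra_condition(ip):
    # WHERE ip BETWEEN start AND stop AND start >= ip
    return [r for r in ROWS if r[0] <= ip <= r[1] and r[0] >= ip]


print(plain_between(1550))         # [(1500, 1599)]
print(with_extra_condition(1550))  # []
print(with_extra_condition(1500))  # [(1500, 1599)]
```

With the extra condition, a value is found only when it coincides exactly with the start of a range.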
