开发者

Mysql partitioning: Partitions outside of date range is included

开发者 https://www.devze.com 2022-12-24 00:27 出处:网络
I have just tried to configure partitions based on date, but it seems that mysql still includes a partition with no relevant data. It will use the relevant

I have just tried to configure partitions based on date, but it seems that mysql still includes a partition with no relevant data. It will use the relevant partition but also include the oldest for some reason. Am I doing it wrong?

The version is 5.1.44 (MyISAM)

I first added a few partitions based on "day", which is of type "date"

ALTER TABLE ptest
PARTITION BY RANGE(TO_DAYS(day))
(
PARTITION p1 VALUES LESS THAN (TO_DAYS('2009-08-01')),
PARTITION p2 VALUES LESS THAN (TO_DAYS('2009-11-01')),
PARTITION p3 VALUES LESS THAN (TO_DAYS('2010-02-01')),
PARTITION p4 VALUES LESS THAN (TO_DAYS('2010-05-01'))
);

After a q开发者_JAVA百科uery, I find that it uses the "old" partition, that should not contain any relevant data.

mysql> explain partitions select * from ptest where day between '2010-03-11' and '2010-03-12';
+----+-------------+------------+------------+-------+---------------+------+---------+------+------+-------------+
| id | select_type | table      | partitions | type  | possible_keys | key  | key_len | ref  | rows | Extra       |
+----+-------------+------------+------------+-------+---------------+------+---------+------+------+-------------+
| 1  | SIMPLE      | ptest      | p1,p4      | range | day           | day  | 3       | NULL | 79   | Using where |
+----+-------------+------------+------------+-------+---------------+------+---------+------+------+-------------+

When I select a single day, it works as expected:

mysql> explain partitions select * from ptest where day = '2010-03-11';
+----+-------------+------------+------------+------+---------------+------+---------+-------+------+-------+
| id | select_type | table      | partitions | type | possible_keys | key  | key_len | ref   | rows | Extra |
+----+-------------+------------+------------+------+---------------+------+---------+-------+------+-------+
| 1  | SIMPLE      | ptest      | p4         | ref  | day           | day  | 3       | const | 39   |       |
+----+-------------+------------+------------+------+---------------+------+---------+-------+------+-------+


This is actually the intended result because the oldest partition will always hold values that did not evaluate to a valid date (null). A workaround for this is to create an additional partition that will hold no data and exist for all values before your oldest date. This partition will always be scanned but has little affect on performance since it is empty.

http://bugs.mysql.com/bug.php?id=49754


You have partitioned on TO_DAYS(date), that means partitioning pruning will mostly only occur in simple cases unless you apply TO_DAYS(date) on the constraints.

You'll have to do e.g. select * from ptest where day between TO_DAYS('2010-03-11') and TO_DAYS('2010-03-12') - although it might be in this case shortcomings in mysql between.

Partitioning on dates in mysql is hard, and there's a lot of shortcomings in the partitioning implementation, atleast if you want to cover a lot of different query constraints, we usually place an integer id in the tables pointing to a calendar instead of a DATE type, as we've found mysql to handle partitioning on a simple integer to be quite reliable compared to partitioning on columns involving a function(such as TO_DAYS) we have

create table datatbl (
  time_id int NOT NULL,
  ....
);

time_id references a calendar prefilled with dates for the next 10 years looking like

create table calendar (
  time_id int primary key
  year int NOT NULL,
  month int NOT NULL,
  day int NOT NULL,
  dayofyear int NOT NULL,
  quarter int NOT NULL,
  is_weekend char(1) NOT NULL,
  db_date DATE not NULL,
  unique index(year,month,day),
  unique index(dbdate)
);

Queryies are joined to this table, so grabbing all data for a month requires just a where cal.year = 2010 and cal.month = 1 . Or it could be done as cal.db_date between '2010-01-01' and '2010-01-31'

datatbl is partitioned on time_id , and the above queries will make mysql do partitioning pruning. time_id is also a composite of year/month/date so the time_id for 2010-03-03 would be the integer 20100303 , that shouldn't be used for querying against, it is simply a convenience for the scripts that automatically create new/drop old partitions.

0

精彩评论

暂无评论...
验证码 换一张
取 消

关注公众号