I work on a large table with around 1.5k entries,
CREATE TABLE `crawler` (
`id` int(11) NOT NULL AUTO_INCREMENT,
...
`provider_id` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `crawler_provider_id` (`provider_id`),
...
) ENGINE=MyISAM ...
provider_id is used to link this table to another table 'providers, which was cleared and repopulated with new data. I was trying to recreate connections from 'crawler' to 'providers' (which should not really matter in case of MyISAM), but for some reason in my script MySQL returns zero rows if i don't provide a limit.
mysql> SELECT `crawler`.`id` FROM `crawler` WHERE `crawler`.`provider_id` > 1371;
Empty set (0.40 sec)
but
mysql> SELECT COUNT(*) FROM `crawler` WHERE `crawler`.`provider_id` > 1371;
| 346999 |
and
mysql> SELECT `crawler`.`id` FROM `crawler` WHERE `crawler`.`provider_id` > 1371 LIMIT 10;
10 rows in set (0.01 sec)
If I select some data from table and check it by myself I can see values greater than 1371.
I was able to fix this by deleting indexes (and recreating later), but I am extremely confused. I've never seen indexes going out of sync with table data (and I was unaware that they can affect values of returned rows). Unfortunately I haven't performed "CHECK TABLE" before deleting indexes, but it has开发者_如何转开发 "status=ok" right now, I can't see anything wrong in logs, and "REPAIR TABLE" shows no problems.
So, is this a common problem? What can be the reason? This server had some low RAM problems before, could it be the issue here as well?
Your query is almost certainly related to table corruption in MyISAM.
I did
root@localhost [kris]> create table crawler (
id integer not null auto_increment primary key,
provider_id int(11) DEFAULT NULL,
PRIMARY KEY (id),
KEY crawler_provider_id (provider_id)
) engine = myisam;
root@localhost [kris]> insert into crawler ( id, provider_id ) values ( NULL, 1 );</code>
and then repeated
root@localhost [kris]> insert into crawler ( id, provider_id)
select NULL, rand() * 120000 from crawler;
until I had
root@localhost [kris]> select count(*) from crawler;
+----------+
| count(*) |
+----------+
| 524288 |
+----------+
1 row in set (0.00 sec)
I now have
root@localhost [kris]> SELECT COUNT(*) FROM `crawler` WHERE `crawler`.`provider_id` > 1371;
+----------+
| COUNT(*) |
+----------+
| 518389 |
+----------+
1 row in set (0.27 sec)
which is somewhat comparable in size to what you gave in your example above. I do get two different plans for the query with and without a LIMIT clause.
Without a LIMIT clause I get a full table scan (ALL) not using any index:
root@localhost [kris]> explain SELECT `crawler`.`id` FROM `crawler` WHERE `crawler`.`provider_id` > 1371\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: crawler
type: ALL
possible_keys: crawler_provider_id
key: NULL
key_len: NULL
ref: NULL
rows: 524288
Extra: Using where
1 row in set (0.00 sec)
With the LIMIT clause, the INDEX is used for a RANGE access
root@localhost [kris]> explain SELECT `crawler`.`id` FROM `crawler` WHERE `crawler`.`provider_id` > 1371 LIMIT 10\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: crawler
type: range
possible_keys: crawler_provider_id
key: crawler_provider_id
key_len: 5
ref: NULL
rows: 518136
Extra: Using where
1 row in set (0.00 sec)
In your example, without the LIMIT clause (full table scan) you get no data, but with the LIMIT clause (range access using index) you get data. That points to a corrupted MYD file.
ALTER TABLE, as REPAIR TABLE or OPTIMIZE TABLE, will normally copy the data and the kept indexes from the source table to a hidden new version of the table in a new format. When completed, the hidden new table will replace the old version of the table (which will be renamed to a hidden name, and then dropped).
That is, by dropping the indexes you effectively repaired the table.
Maybe you can delete and recreate the index, and after that repair or optimize the table so all indices get rebuilt. That may help you. And look at your configuration to see if the memory settings are appropriate.
精彩评论