I'm running the following query in MYSQL
select distinct straight_join
cu.entryid entryid,
t0.tokpos starting_position,
t3.tokpos ending_position,
t0.idxsent idxsent,
'TOKENS_44_340' tablename
from
TOKENS_44_340 t0,
constraints_appraisal cu,
TOKENS_44_340 t1,
TOKENS_44_340 t2,
TOKENS_44_340 t3
where
t0.token_surface = cu.token_0
and (cu.pos_0 is null OR t0.penntag like concat(cu.pos_0,'%'))
and t1.token_surface = cu.token_1
and (cu.pos_1 is null OR t1.penntag like concat(cu.pos_1,'%'))
and t2.token_surface = cu.token_2
and (cu.pos_2 is null OR t2.penntag like concat(cu.pos_2,'%'))
and t3.token_surface = cu.token_3
and (cu.pos_3 is null OR t3.penntag like concat(cu.pos_3,'%'))
and t0.tokpos = t1.tokpos - 1
and t1.tokpos = t2.tokpos - 1
and t2.tokpos = t3.tokpos - 1
and cu.token_4 is null
and cu.token_5 is null
and cu.token_6 is null
and cu.token_7 is null
and cu.token_8 is null
and cu.token_9 is null;
MySQL gives me the following query plan for this query:
+----+-------------+-------+------+---------------------------------------+------------------------+---------+------+------+-----------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------------------------------+------------------------+---------+------+------+-----------------+
| 1 | SIMPLE | t0 | ALL | PRIMARY,TOKENS_44_340_index_44,tokpos | NULL | NULL | NULL | 49 | Using temporary |
| 1 | SIMPLE | cu | ALL | NULL | NULL | NULL | NULL | 7907 | Using where |
| 1 | SIMPLE | t1 | ref | PRIMARY,TOKENS_44_340_index_44,tokpos | TOKENS_44_340_index_44 | 399 | func | 4 | Using where |
| 1 | SIMPLE | t2 | ref | PRIMARY,TOKENS_44_340_index_44,tokpos | TOKENS_44_340_index_44 | 399 | func | 4 | Using where |
| 1 | SIMPLE | t3 | ref | TOKENS_44_340_index_44 | TOKENS_44_340_index_44 | 399 | func | 4 | Using where |
+----+-------------+-------+------+---------------------------------------+------------------------+---------+------+------+-----------------+
5 rows in set (0.00 sec)
As you can see, MySQL doesn't even so much as acknowledge the existance of my index token_0
on constraints_appraisal(token_0)
. Any idea why it's ignoring my index, and what I can do about it? I'm running mysql 5.0.51a-24+lenny4 for Debian stable.
P.S. I know I could make this query run faster by removing the straight_join
constraint and let it use the token_surface index
on t0
, but it still wouldn't be as fast as it could be using the token_0
index on constraints_appraisal
. I added the straight_join
so that I could make my specific issue appear more clearly, and I plan remove it when I have the index working properly.
mysql> d开发者_Python百科escribe TOKENS_44_340;
+---------------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+---------------+--------------+------+-----+---------+-------+
| tokPos | int(11) | NO | PRI | NULL | |
| linePos | int(11) | YES | | NULL | |
| EOLs | int(11) | YES | | NULL | |
| idxsent | int(11) | YES | | NULL | |
| possent | int(11) | YES | | NULL | |
| brilltag | int(11) | YES | | NULL | |
| token_surface | varchar(132) | YES | MUL | NULL | |
| wordLen | int(11) | YES | | NULL | |
| capitalized | int(11) | YES | | NULL | |
| wordType | int(11) | YES | | NULL | |
| numDigit | int(11) | YES | | NULL | |
| numPunc | int(11) | YES | | NULL | |
| numAlpha | int(11) | YES | | NULL | |
| maxRep | int(11) | YES | | NULL | |
| pre1 | varchar(132) | YES | | NULL | |
| pre2 | varchar(132) | YES | | NULL | |
| pre3 | varchar(132) | YES | | NULL | |
| pre4 | varchar(132) | YES | | NULL | |
| suf1 | varchar(132) | YES | | NULL | |
| suf2 | varchar(132) | YES | | NULL | |
| suf3 | varchar(132) | YES | | NULL | |
| suf4 | varchar(132) | YES | | NULL | |
| dep_gov | int(11) | YES | MUL | NULL | |
| dep_rel | varchar(20) | YES | MUL | NULL | |
| penntag | varchar(30) | YES | | NULL | |
+---------------+--------------+------+-----+---------+-------+
25 rows in set (0.04 sec)
mysql> describe constraints_appraisal;
+---------------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+---------------+--------------+------+-----+---------+-------+
| entryid | int(11) | NO | PRI | 0 | |
| context | varchar(50) | YES | | NULL | |
| syntax | int(11) | YES | | NULL | |
| token_0 | varchar(50) | YES | MUL | NULL | |
| pos_0 | varchar(50) | YES | | NULL | |
| porter_0 | varchar(50) | YES | MUL | NULL | |
| token_1 | varchar(50) | YES | | NULL | |
| pos_1 | varchar(50) | YES | | NULL | |
| porter_1 | varchar(50) | YES | | NULL | |
| token_2 | varchar(50) | YES | | NULL | |
| pos_2 | varchar(50) | YES | | NULL | |
| porter_2 | varchar(50) | YES | | NULL | |
| token_3 | varchar(50) | YES | | NULL | |
| pos_3 | varchar(50) | YES | | NULL | |
| porter_3 | varchar(50) | YES | | NULL | |
| token_4 | varchar(50) | YES | | NULL | |
| pos_4 | varchar(50) | YES | | NULL | |
| porter_4 | varchar(50) | YES | | NULL | |
| token_5 | varchar(50) | YES | | NULL | |
| pos_5 | varchar(50) | YES | | NULL | |
| porter_5 | varchar(50) | YES | | NULL | |
| token_6 | varchar(50) | YES | | NULL | |
| pos_6 | varchar(50) | YES | | NULL | |
| porter_6 | varchar(50) | YES | | NULL | |
| token_7 | varchar(50) | YES | | NULL | |
| pos_7 | varchar(50) | YES | | NULL | |
| porter_7 | varchar(50) | YES | | NULL | |
| token_8 | varchar(50) | YES | | NULL | |
| pos_8 | varchar(50) | YES | | NULL | |
| porter_8 | varchar(50) | YES | | NULL | |
| token_9 | varchar(50) | YES | | NULL | |
| pos_9 | varchar(50) | YES | | NULL | |
| porter_9 | varchar(50) | YES | | NULL | |
| token_surface | varchar(200) | YES | | NULL | |
| fileid | varchar(100) | YES | | NULL | |
+---------------+--------------+------+-----+---------+-------+
35 rows in set (0.06 sec)
mysql> show index from constraints_appraisal;
+-----------------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+-----------------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| constraints_appraisal | 0 | PRIMARY | 1 | entryid | A | 7907 | NULL | NULL | | BTREE | |
| constraints_appraisal | 1 | token_0 | 1 | token_0 | A | NULL | NULL | NULL | YES | BTREE | |
| constraints_appraisal | 1 | porter_0 | 1 | porter_0 | A | NULL | NULL | NULL | YES | BTREE | |
+-----------------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
3 rows in set (0.05 sec)
It appears that the problem is that the two tables are in different character sets.
TOKENS_44_340
is inutf8
constraints_appraisal
is inlatin1
精彩评论