I'm loading text files into my db and trying to do some quick matching between a table that lists names of organizations, and a table that holds the text file and potential matches to those organizations.
I load the file using LOAD DATA CONCURRENT INFILE and don't have any problems with that.
The twist is that the field I'm trying to match in the raw text table (occupationoraffiliation) has more than just organization names in it, so I'm trying to use LIKE with wildcards to match the strings.
To match the text, I'm trying to use this query:
UPDATE raw_faca JOIN orgs AS o
ON raw_faca.org_id IS NULL AND raw_faca.occupationoraffiliation LIKE CONCAT('%',o.org_name,'%')
SET raw_faca.org_id = o.org_id;
I've also tried without CONCAT:
UPDATE raw_faca JOIN orgs AS o
ON raw_faca.org_id IS NULL AND raw_faca.occupationoraffiliation LIKE ('%' + o.org_name + '%')
SET raw_faca.org_id = o.org_id;
The raw_faca table has ~40,000 rows and the orgs table has ~20,000 rows. I have indexes on all the fields involved. The query has been running for a couple of hours or so, which seems like way too long for the operation. Is the comparison I'm trying to run just that inefficient, or am I doing something spectacularly stupid here? I was hoping to avoid going line-by-line with an external PHP or Python script.
In response to comments below about using MATCH ... AGAINST, I've tried the following query as well:
UPDATE raw_faca JOIN orgs AS o ON raw_faca.org_id IS NULL AND MATCH(raw_faca.occupationoraffiliation) AGAINST (o.org_name IN NATURAL LANGUAGE MODE)
SET raw_faca.org_id = o.org_id;
And it's giving me this error:
Incorrect arguments to AGAINST
Any thoughts?
A LIKE clause with a leading wildcard cannot take advantage of any indexes, so every row of raw_faca has to be compared against every row of orgs.
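To put a rough number on it: with about 40,000 rows on one side and about 20,000 on the other, a match that cannot use an index works out to roughly 40,000 × 20,000 = 800 million LIKE evaluations. If an external script turns out to be acceptable after all, the same substring test could be sketched in Python along these lines. The function name `match_orgs` and the sample data are hypothetical, the comparison is made case-insensitive on the assumption that the columns use one of MySQL's case-insensitive collations, and the first matching organization wins (the UPDATE itself does not say which match is kept when several organizations fit).

```python
def match_orgs(affiliations, orgs):
    """Return {row_id: org_id} for the first org whose name occurs in each text.

    affiliations: iterable of (row_id, occupationoraffiliation) pairs
    orgs:         iterable of (org_id, org_name) pairs
    """
    # Lower-case the names once up front; skip empty names, which would
    # otherwise match every row.
    lowered = [(org_id, name.lower()) for org_id, name in orgs if name]

    matches = {}
    for row_id, text in affiliations:
        haystack = text.lower()
        for org_id, name in lowered:
            # Plain substring test, the equivalent of LIKE '%name%'
            # when the name contains no % or _ characters.
            if name in haystack:
                matches[row_id] = org_id
                break  # keep the first match only (an assumption)
    return matches


# Hypothetical sample data, not taken from the real tables.
orgs = [(1, "Acme Corp"), (2, "State University")]
affiliations = [
    (10, "Professor, State University, Dept. of Biology"),
    (11, "Senior engineer at ACME Corp"),
    (12, "Independent consultant"),
]

result = match_orgs(affiliations, orgs)
```

The resulting mapping could then be written back with one `UPDATE raw_faca SET org_id = ... WHERE` statement per matched row, keyed on whatever primary key raw_faca has. This still does the full pairwise comparison, so it only moves the work out of the database rather than removing it.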