I have a table with a field that contains a bunch of neighborhood names. Some of these neighborhoods have names with 2 or more words. How can I get a list of words that are 3 or fewer characters long and occur in the middle of a name with 3 or more words?
For example:
Lake = Do nothing, only 1 word
Golden Lake = Do nothing, only 2 words
Lakes of Gold = Extract "of"
In essence, I want to make a list of 'garbage' words to remove when I build metaphone sentences.
SELECT 'Lake of gold' RLIKE '[[:<:]].+[[:>:]].+[[:<:]].{1,3}[[:>:]].+[[:<:]].+[[:>:]]'
Unfortunately, MySQL can only match the regexps, not extract the patterns. You will have to do the filtering in MySQL and the extraction on the script side.
SELECT * FROM mytable WHERE mycolumn REGEXP "[[:alnum:]]+[[:space:]]+[[:alnum:]]{1,3}[[:space:]]+[[:alnum:]]+";
This will find all entries that contain at least one word of up to 3 characters between two other words.
You can't extract the words in MySQL directly, but this will filter the relevant rows. You have to do the extraction in a separate step.
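For the separate extraction step, here is a minimal sketch in Python. It assumes the rows matched by the query have already been fetched into a list of strings. The function names `short_middle_words` and `garbage_words` are made up for illustration, and splitting on whitespace is an assumption about how a "word" is defined.

```python
def short_middle_words(name, max_len=3):
    """Return the words of at most max_len characters that are neither first nor last."""
    words = name.split()
    if len(words) < 3:
        # Names with 1 or 2 words have no middle word: do nothing.
        return []
    return [w for w in words[1:-1] if len(w) <= max_len]


def garbage_words(names, max_len=3):
    """Collect the distinct short middle words (lowercased) across all names."""
    found = set()
    for name in names:
        found.update(w.lower() for w in short_middle_words(name, max_len))
    return sorted(found)


# Stand-in for the values returned by the filtering query (hypothetical data).
names = ["Lake", "Golden Lake", "Lakes of Gold"]
print(garbage_words(names))  # ['of']
```

Lowercasing before collecting means "Of" and "of" count as the same garbage word. Drop the `.lower()` call if case matters for your metaphone step.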