MySQL Match Fulltext

Developer https://www.devze.com 2023-01-30 08:50 Source: web
I'm trying to do a full-text search with MySQL to match a string. The problem is that it's returning odd results.

For example, searching for the string 'passat 2.0 tdi':

            AND MATCH (
            records_veiculos.titulo, records_veiculos.descricao
            )
            AGAINST (
             'passat 2.0 tdi' WITH QUERY EXPANSION
            )

returns this as the first result (the others are fine):

Volkswagen Passat Variant 1.9 TDI- ANO 2003

which is incorrect, since there's no "2.0" in this result.

What could it be?

Edit: Since this will probably be a large database (up to 500,000 records expected), is this search method the best choice on its own, or would it be better to install a separate search engine such as Sphinx? And if it isn't the best choice, how can I show relevant results?

Edit 2: For the record, despite the question being marked as answered, the problem with the MySQL delimiters persists, so if anyone has a suggestion on how to escape delimiters, it would be appreciated and worth the 500 points at stake. The solution I found to improve the result set was to replace WITH QUERY EXPANSION with IN BOOLEAN MODE, using operators to force the engine to match the words I needed, like this:

AND MATCH (
records_veiculos.titulo, records_veiculos.descricao
)
AGAINST (
 '+passat +2.0 +tdi' IN BOOLEAN MODE
)

This didn't fully solve the problem, but at least the relevance of the results has changed significantly.


From the MySQL documentation on Fulltext search:

"The FULLTEXT parser determines where words start and end by looking for certain delimiter characters; for example, “ ” (space), “,” (comma), and “.” (period)."

This means that the period is delimiting the 2 and the 0. So it's not looking for '2.0'; it's looking for '2' and '0', and not finding them. WITH QUERY EXPANSION is probably causing related words to show up, so '2' and '0' no longer need to appear as individual words for a row to rank highly. A minimum word length may also be in effect.
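One way to check this on your own data is to compare the score for the full search string with the score for 'passat' alone. This is only a sketch: it assumes a MyISAM table with a FULLTEXT index on (titulo, descricao), as the MATCH clause in the question implies, and the column aliases are made up for illustration.

```sql
-- Shows the minimum indexed word length (MyISAM default is 4).
SHOW VARIABLES LIKE 'ft_min_word_len';

-- Assumption: records_veiculos has a FULLTEXT index on (titulo, descricao).
-- score_full and score_passat are hypothetical alias names.
SELECT titulo,
       MATCH (titulo, descricao) AGAINST ('passat 2.0 tdi') AS score_full,
       MATCH (titulo, descricao) AGAINST ('passat')         AS score_passat
FROM records_veiculos
WHERE MATCH (titulo, descricao) AGAINST ('passat 2.0 tdi')
ORDER BY score_full DESC
LIMIT 10;
```

If '2', '0' and 'tdi' are being dropped as too short, the two score columns should come out the same or very close, which would confirm that only 'passat' is actually being searched.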


By default, I believe MySQL only indexes and matches words of 4 or more characters. You could also try escaping the period. It might be ignoring it or otherwise treating it as a stop character.
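As far as I know, the built-in parser has no escape syntax for delimiters, so '2.0' can't be turned into a single token this way. A possible workaround, sketched here on the assumption that a LIKE filter over the already narrowed rows is fast enough for you, is to let FULLTEXT match the long word and check the literal string separately:

```sql
-- Sketch only: FULLTEXT narrows the rows by the indexable word,
-- then LIKE checks the literal '2.0' that the parser would split apart.
-- Assumes a FULLTEXT index on (titulo, descricao).
SELECT titulo
FROM records_veiculos
WHERE MATCH (titulo, descricao)
      AGAINST ('+passat' IN BOOLEAN MODE)
  AND (titulo LIKE '%2.0%' OR descricao LIKE '%2.0%');
```

Note that '%2.0%' is a plain substring test, so it would also accept values such as "12.0" or "2.05". Tighten the pattern if that matters for your data.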


What relevance score does it return for that row? Does the match have to contain all the "words"? My understanding was that it works like Google and only needs to match some of the words.

Having said that, bear in mind the effect of adding WITH QUERY EXPANSION: it automatically runs a second search for "related" words, which may not be what you typed but which the full-text engine deems probably related.
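To see how much the expansion changes the ranking, it may help to put both scores side by side. A rough sketch, again assuming a FULLTEXT index on (titulo, descricao); the alias names are invented for the example:

```sql
-- score_plain: natural-language relevance for the words as typed.
-- score_expanded: relevance after the automatic second pass.
SELECT titulo,
       MATCH (titulo, descricao)
             AGAINST ('passat 2.0 tdi') AS score_plain,
       MATCH (titulo, descricao)
             AGAINST ('passat 2.0 tdi' WITH QUERY EXPANSION) AS score_expanded
FROM records_veiculos
WHERE MATCH (titulo, descricao)
      AGAINST ('passat 2.0 tdi' WITH QUERY EXPANSION)
ORDER BY score_expanded DESC
LIMIT 20;
```

Rows with a high score_expanded but a score_plain of 0 were pulled in only by the expansion pass, not by the words in the search string.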

Relevant Documentation: http://dev.mysql.com/doc/refman/5.1/en/fulltext-query-expansion.html


The "." is what's matching on 2003 in your query results.

If you're going to search on 3-character strings, you should set ft_min_word_len=3 in your MySQL config and restart MySQL. Otherwise, a search for "tdi" will return results with "TDI-" but not with just "TDI", because rows with "TDI-" will be indexed but "TDI" alone will not.

After making that config change, you'll have to rebuild your index on that table. (Warning: your index might be significantly larger now.)
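The config change and rebuild might look like this. It's a sketch for a MyISAM table (the only engine with FULLTEXT support in the 5.1 series linked above). InnoDB in later versions uses innodb_ft_min_token_size instead and, as far as I know, needs the index dropped and re-created rather than repaired.

```sql
-- In the MySQL config file (not SQL, shown here as comments),
-- followed by a server restart:
--   [mysqld]
--   ft_min_word_len=3

-- After the restart, confirm the new value is in effect.
SHOW VARIABLES LIKE 'ft_min_word_len';

-- Rebuild the FULLTEXT index so existing rows are re-tokenized.
-- A QUICK repair is enough for MyISAM full-text indexes.
REPAIR TABLE records_veiculos QUICK;
```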
