I'm implementing an ingredient text search, for adding ingredients to a recipe. I've currently got a full text index on the ingredient name, which is stored in a single text field, like so:
"Sauce, tomato, lite, Heinz"
I've found that because there are a lot of ingredients with very similar names in the database, simply sorting by relevance doesn't work that well a lot of the time. So, I've found myself sorting by a bunch of my own rules of thumb, which probably duplicat开发者_如何学Pythones a lot of the full-text search algorithm which spits out a numerical relevance. For instance (abridged):
ORDER BY [ingredient name is exactly search term], [ingredient name starts with search term], [ingredient name starts with any word from the search and contains all search terms in some order], [ingredient name contains all search terms in some order],
...and so on. Each of these is defined in the SELECT specification as an expression returning either 1 or 0, and so I order by those in sequential order.
I would love to hear suggestions for:
- A better way to define complicated order-by criteria in one place, say perhaps in a view or stored procedure that you can pass just the search term to and get back a set of results without having to worry about how they're ordered?
- A better tool for this than MySQL's fulltext engine -- perhaps if I was using Sphinx or something [which I've heard of but not used before], would I find some sort of complicated config option designed to solve problems like this?
- Some google search terms which might turn up discussion on how to order text items within a specific domain like this? I haven't found much that's of use.
Thanks for reading!
Jeremy,
What you are looking for is Rank Boosting which is supported by Solr. Here is a link where you can read more about this:
http://wiki.apache.org/solr/SolrRelevancyCookbook#Ranking_Terms
精彩评论