I have a PHP website with data stored in a MySQL database (approximately 50,000 articles). I want to improve the results of the full-text search functionality and stop using just a simple LIKE query.
I found Zend_Search_Lucene in the Zend Framework, and it seems to be a great tool.
Do you think Zend_Search_Lucene is a good choice in my case?
After indexing all my articles with Lucene, do I need to keep the data in MySQL, or is Zend_Search_Lucene enough to hold all the data?
Thanks in advance,
I would first investigate whether MySQL's native full-text searching meets your needs before jumping to a Lucene-based solution. It is a major improvement over LIKE statements, without the additional implementation work that Lucene requires.
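As a rough sketch of what the native approach could look like: the table name `articles`, the columns `id`, `title` and `body`, and the index name are assumptions for illustration, not taken from the question. Note also that older MySQL versions support FULLTEXT indexes only on MyISAM tables (InnoDB support arrived in MySQL 5.6), so check your storage engine first.

```sql
-- Hypothetical schema: articles(id, title, body).
-- Add a full-text index covering the columns you want to search.
ALTER TABLE articles
  ADD FULLTEXT INDEX ft_articles_title_body (title, body);

-- Natural-language search (the default mode), ranked by relevance.
-- The column list in MATCH() must match the indexed columns exactly.
SELECT id, title,
       MATCH(title, body) AGAINST('zend lucene') AS relevance
FROM articles
WHERE MATCH(title, body) AGAINST('zend lucene')
ORDER BY relevance DESC
LIMIT 20;

-- Boolean mode: require "lucene", exclude "solr".
SELECT id, title
FROM articles
WHERE MATCH(title, body) AGAINST('+lucene -solr' IN BOOLEAN MODE)
LIMIT 20;
```

Compared with `WHERE body LIKE '%term%'`, this uses an index instead of scanning every row and returns a relevance score you can sort on.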
Zend_Search_Lucene is a pure PHP implementation of Lucene and can therefore be pretty slow when used with large datasets. I would skip it and look at implementing Apache Solr. There is a PECL extension for it, which is documented here.
I have used MySQL's full-text search on over 200,000 documents with a good amount of data. My search times are around 0.5 to 2 seconds on popular terms, with a very rare 5 or 6 second response every so often. I update some data each day, so long-term caching doesn't work well, but if I could cache searches I could be looking at 0.2 second times or lower after caching.
I am testing a move to Zend Lucene, and so far the same searches come in under 1.5 seconds for the most used terms.
All of the above is on a dedicated server with 2 GB of RAM and a Core 2 Duo.
I am no expert, but for 50,000 articles I agree with Treffynnon: check out full-text searching instead of using LIKE. If you do move to a new version of Zend Lucene, I believe the indexes are compatible with the Java version, so it may make for a good gateway if, down the road, you add more articles and need more speed.