I am a new coder at a startup, and I am implementing search over the documents in a directory on a web host.
I am comparing Lucene/Solr, Whoosh, Sphinx and Xapian. Whoosh is pure Python, but I want your opinions on it too. Which of these has
- mature Python interfaces that are easy to install and use? (Whoosh is a no-brainer)
- the lowest risk of crashes, bottlenecks and other failures?
- the best-documented interface? (I'm not going to read PHP docs just because the Python docs are sparse)
- the easiest path to getting up and running? (only one has a quick-start tutorial)
Speaking for Apache Solr, Python has several Solr clients, which I've collected based on feedback from our customers at Websolr:
- Haystack is very popular, and designed for seamless integration within Django apps. If you're developing a Django app, Haystack is for you.
- Sunburnt looks to be more generic than Haystack, and is also very well documented. If you're doing plain ol' Python, Sunburnt is worth a look.
Other Python Solr clients that I've found, which seem a bit lower level...
- solrpy
- pysolr (I know, right?)
- Insol
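All of these clients ultimately talk to Solr over HTTP, so it can help to see roughly what a query looks like underneath them. Here is a minimal sketch using only the standard library. The base URL (Solr's default port on localhost) and the core name `documents` are assumptions for illustration, as are the helper names `build_select_url` and `search`; swap in whatever your own setup uses.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed for illustration: a local Solr instance with a core named "documents".
SOLR_BASE = "http://localhost:8983/solr"
CORE = "documents"


def build_select_url(query, rows=10, base=SOLR_BASE, core=CORE):
    """Build a URL for Solr's standard /select request handler."""
    params = urlencode({"q": query, "rows": rows, "wt": "json"})
    return f"{base}/{core}/select?{params}"


def search(query, rows=10):
    """Run the query against Solr and return the list of matching documents."""
    with urlopen(build_select_url(query, rows)) as response:
        payload = json.load(response)
    return payload["response"]["docs"]
```

Calling `search("title:python")` needs a running Solr server, whereas `build_select_url` can be tried on its own. A real client library adds things this sketch leaves out, such as indexing, error handling and result paging.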
Some more details about how your app is built (in particular, is it a Django app?) would help narrow things down from here. Good luck finding the best fit for your app!
Use Whoosh if you don't need the speed or extra features of the alternatives. It's great: it has a nice API and good documentation. My second choice would probably be Xapian, which is fast and has a fairly decent API. They are all fairly mature products. If you don't know what you really need, I'd just go with Whoosh for now.
If you want quick Python integration, try IndexTank. You can be up and running in 2 minutes, and it's free.
Of the other alternatives, I'd go with Solr (provided you want to host the search servers yourself, or sign up for Websolr).
Disclaimer: I work at IndexTank.