I'm developing a context discovery system, which is a mix of search and suggestions.
I'm currently looking for an indexing library. After some investigation I settled on Lucene and Terrier as candidates; I found Indri uncomfortable to work with. What are the downsides of each? What problems might I run into when using them?
Is it true that Terrier doesn't support incremental indexing (that is, every time a new document is added, I need to rebuild and reindex everything)?
My requirements are:
- easy addition of new documents
- easy injection of custom scoring methods
- a fairly well-defined model
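To make the first two requirements concrete, here is a rough sketch of the behaviour I'm after. It is a toy in-memory index, not Lucene or Terrier code: `TinyIndex`, `tf_idf`, `raw_tf` and the scorer signature are all hypothetical names made up for illustration.

```python
import math
from collections import Counter, defaultdict
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical scorer signature:
# (term frequency in doc, number of docs containing the term, total docs) -> weight
Scorer = Callable[[int, int, int], float]


def tf_idf(tf: int, df: int, n_docs: int) -> float:
    """A simple TF-IDF variant; any function with the same signature can replace it."""
    return tf * math.log(1 + n_docs / df)


def raw_tf(tf: int, df: int, n_docs: int) -> float:
    """Alternative scorer that ignores collection statistics."""
    return float(tf)


class TinyIndex:
    """Toy inverted index: incremental adds plus a pluggable scoring function."""

    def __init__(self, scorer: Scorer = tf_idf) -> None:
        self.postings: Dict[str, Dict[str, int]] = defaultdict(dict)  # term -> {doc_id: tf}
        self.doc_ids: Set[str] = set()
        self.scorer = scorer

    def add(self, doc_id: str, text: str) -> None:
        """Add one document; postings of existing documents are left untouched."""
        if doc_id in self.doc_ids:
            raise ValueError(f"duplicate document id: {doc_id}")
        self.doc_ids.add(doc_id)
        for term, tf in Counter(text.lower().split()).items():
            self.postings[term][doc_id] = tf

    def search(self, query: str) -> List[Tuple[str, float]]:
        """Return (doc_id, score) pairs, best first; ties are broken by doc_id."""
        scores: Dict[str, float] = defaultdict(float)
        n_docs = len(self.doc_ids)
        for term in query.lower().split():
            docs = self.postings.get(term, {})
            for doc_id, tf in docs.items():
                scores[doc_id] += self.scorer(tf, len(docs), n_docs)
        return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
```

In this sketch, calling `add` after the index has already been queried makes the new document searchable immediately, and assigning a different function to `scorer` changes the ranking without reindexing. That is the kind of workflow I'd like the real library to support.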
And one more thing: is Terrier still actively maintained? I haven't seen any updates since 10/03/2010 in the Terrier changelog.
What sort of database are you going to be using? Lucene, in my experience, is much better documented than Terrier.
Here's an article comparing Lucene and Terrier:
http://text-analytics.blogspot.com/2011/05/java-based-retrieval-toolkits.html