I'm building an ASP.NET MVC site where I plan to use Lucene.Net. I've envisioned a way to structure the usage of Lucene, but I'm not sure whether my planned architecture is OK and efficient.
My Plan:
- On the Application_Start event in Global.asax: I check for the existence of the index on the file system - if it doesn't exist, I create it and fill it with documents extracted from the database.
- When new content is submitted: I create an IndexWriter, fill up a document, write to the index, and finally dispose of the IndexWriter. IndexWriters are not reused, as I can't imagine a good way to do that in an ASP.NET MVC application.
- When content is edited: I repeat the same process as when new content is submitted, except that I first delete the old content and then add the edits.
- When a user searches for content: I check HttpRuntime.Cache to see if a user has already searched for this term in the last 5 minutes - if they have, I return those results; otherwise, I create an IndexReader, build and run a query, put the results in HttpRuntime.Cache, return them to the user, and finally dispose of the IndexReader. Once again, IndexReaders aren't reused.
My Questions:
- Is that a good structure - how can I improve it?
- Are there any performance/efficiency problems I should be aware of?
- Also, is not reusing the IndexReaders and IndexWriters a huge code smell?
The answer to all three of your questions is the same: reuse your readers (and possibly your writers). You can use a singleton pattern to do this (i.e. declare your reader/writer as public static). Lucene's FAQ tells you the same thing: share your readers, because the first query is reaaalllyyyy slow. Lucene handles all the locking for you, so there is really no reason why you shouldn't have a shared reader.
It's probably easiest to just keep your writer around and (using the NRT model) get the readers from that. If it's rare that you are writing to the index, or if you don't have a huge need for speed, then it's probably OK to open your writer each time instead. That is what I do.
Edit: added a code sample:
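    // myDir and myAnalyzer are placeholders for your own Directory and Analyzer.
    public static IndexWriter writer =
        new IndexWriter(myDir, myAnalyzer, IndexWriter.MaxFieldLength.UNLIMITED);

    public JsonResult SearchForStuff(string query)
    {
        // Near-real-time reader: sees changes made through the shared writer.
        IndexReader reader = writer.GetReader();
        IndexSearcher search = new IndexSearcher(reader);
        // do the search
    }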
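For the edit case from the question, a shared writer can also replace the delete-then-add step with a single UpdateDocument call. What follows is only a minimal sketch, assuming the Lucene.Net 3.0.x API. The class name SharedIndex, the field names "id" and "body", and the index path are hypothetical and not part of the original post.

```csharp
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.QueryParsers;
using Lucene.Net.Search;
using Lucene.Net.Store;
using Version = Lucene.Net.Util.Version;

// Hypothetical helper; assumes the Lucene.Net 3.0.x API.
public static class SharedIndex
{
    private static readonly StandardAnalyzer Analyzer =
        new StandardAnalyzer(Version.LUCENE_30);

    // One writer for the whole application; IndexWriter is thread-safe.
    // The path below is a placeholder.
    private static readonly IndexWriter Writer = new IndexWriter(
        FSDirectory.Open(new System.IO.DirectoryInfo(@"C:\path\to\index")),
        Analyzer,
        IndexWriter.MaxFieldLength.UNLIMITED);

    // Add or edit: replaces any document whose "id" term matches.
    public static void Save(string id, string body)
    {
        var doc = new Document();
        doc.Add(new Field("id", id, Field.Store.YES, Field.Index.NOT_ANALYZED));
        doc.Add(new Field("body", body, Field.Store.YES, Field.Index.ANALYZED));
        Writer.UpdateDocument(new Term("id", id), doc);
        Writer.Commit();
    }

    // Search through a near-real-time reader obtained from the writer.
    public static int CountHits(string queryText)
    {
        using (IndexReader reader = Writer.GetReader())
        using (var searcher = new IndexSearcher(reader))
        {
            // QueryParser is not thread-safe, so create one per call.
            var parser = new QueryParser(Version.LUCENE_30, "body", Analyzer);
            Query query = parser.Parse(queryText);
            return searcher.Search(query, 10).TotalHits;
        }
    }
}
```

Disposing the reader returned by GetReader releases only that reader; the shared writer stays open for the lifetime of the application. If searches far outnumber writes, holding on to one reader and refreshing it occasionally may be cheaper than asking the writer for a new one on every request.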
I would probably skip the caching -- Lucene is very, very efficient. Perhaps so efficient that it is faster to search again than to cache.
The full index build in Application_Start feels a bit off to me -- it should probably be run in its own thread so as not to block other expensive startup activities.
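One way to move the initial build off the startup thread is sketched below. It is an illustration only: IndexBootstrapper and its two delegate parameters are hypothetical names, and the sketch assumes .NET 4.0 or later for the Task API.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper: runs the initial index build off the startup thread.
public static class IndexBootstrapper
{
    // Exposed so callers can check whether the build finished or faulted.
    public static Task BuildTask { get; private set; }

    // indexExists and buildIndex are supplied by the application,
    // e.g. a file-system check and a database-to-index export.
    public static void Start(Func<bool> indexExists, Action buildIndex)
    {
        // LongRunning hints that the work should get a dedicated thread.
        BuildTask = Task.Factory.StartNew(() =>
        {
            if (!indexExists())
            {
                buildIndex();
            }
        }, TaskCreationOptions.LongRunning);
    }
}
```

Application_Start would then call IndexBootstrapper.Start(...) once and return immediately. Searches that arrive before the build completes will see an empty or partial index, and ASP.NET can recycle the application while the build is still running, so the build routine should be safe to run again.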