
Forgot to close the Lucene IndexWriter after adding Documents to the index

Developer https://www.devze.com 2023-02-18 12:05 Source: Internet
I had a program running for 2 days to build a Lucene index for around 160 million text files. After the program ended, I tried searching the index and found that it was not correctly built: indexReader.numDocs() returned 0. I checked the index directory and it looked good. All the index data seemed to be there, and the directory is 1.5 gigabytes in size.

I checked my code and found that I forgot to call indexWriter.optimize() and indexWriter.close(). Is it possible to re-optimize() the index so I don't need to rebuild the whole index from scratch? I don't really want the program to take another 2 days.


Calling IndexWriter.optimize() is not necessary, and you can call it at a later time by reopening the index. It only merges the index segments for better read performance and doesn't otherwise affect anything.

If you forgot to call IndexWriter.close(), however, then your index might not be complete. Since you processed so many documents, it likely flushed most of them, so hopefully you only need to re-index the last ones. Use Luke, as suggested, as a UI to quickly browse the index and see what state it's in.
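To illustrate why a missing close() matters, here is a minimal sketch using only the Java standard library. It is an analogy, not Lucene code: BufferedWriter stands in for any writer that buffers data in memory, and the class and method names (CloseDemo, sizesBeforeAndAfterClose) are made up for this example. It writes a few short lines and compares the on-disk file size before and after close().

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class CloseDemo {
    // Hypothetical helper: returns {size before close(), size after close()}
    // for a temp file that receives a few short lines through a buffered writer.
    static long[] sizesBeforeAndAfterClose(int lines) {
        try {
            Path file = Files.createTempFile("close-demo", ".txt");
            try {
                long before;
                try (BufferedWriter writer = Files.newBufferedWriter(file)) {
                    for (int i = 0; i < lines; i++) {
                        writer.write("doc " + i);
                        writer.newLine();
                    }
                    // The small writes are still sitting in the in-memory buffer here.
                    before = Files.size(file);
                } // try-with-resources calls close(), which flushes the buffer to disk
                long after = Files.size(file);
                return new long[] {before, after};
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long[] sizes = sizesBeforeAndAfterClose(3);
        System.out.println("before close: " + sizes[0] + " bytes");
        System.out.println("after close:  " + sizes[1] + " bytes");
    }
}
```

Lucene's IndexWriter also implements Closeable, so the same try-with-resources pattern should guard against forgetting close() in an indexing program, assuming a Java version that supports it.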
