
My Java process's file descriptors are going "bad" and I have no idea why


I have a java webapp, built with Lucene, and I keep getting various "file already closed" exceptions - depending on which Directory implementation I use. I've been able to get "java.io.IOException Bad File Descriptor" and "java.nio.channels.ClosedChannelException" out of Lucene, usually wrapped around an AlreadyClosedException for the IndexReader.

The funny thing is, I haven't closed the IndexReader and it seems the file descriptors are going stale on their own. I'm using the latest version of Lucene 3.0 (haven't had time to upgrade out of the 3.0 series), the latest version of Oracle's JDK6, the latest version of Tomcat 6 and the latest version of CentOS. I can replicate the bug with the same software on other Linux systems, but not on Windows systems, and I don't have an OSX machine to test with. The Linux servers are virtualized with QEMU, in case that matters at all.

This also seems to be load related - how frequently it happens tracks the number of requests per second that Tomcat is serving to this particular webapp. For example, on one server every request completes as expected until it hits about 2 reqs/sec; then roughly 10% of requests start having their file descriptors closed out from under them mid-request (the code checks for a valid IndexReader object and creates one at the beginning of processing each request). Once it reaches about 3 reqs/sec, all of the requests start failing with bad file descriptors.

My best guess is that there's resource starvation at the OS level and the OS is cleaning up fds... but that's only because I've eliminated every other idea I've had. I've already checked the ulimits and the filesystem fd limits, and the number of open descriptors is well below either limit (example output from sysctl fs.file-nr: 1020 0 203404; ulimit -n: 10240).
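As a quick sanity check (not something from the original post, just a hypothetical spot check), you can also count the descriptors the JVM itself holds open by listing /proc/self/fd from inside the process and comparing the number against the ulimit and fs.file-nr figures above:

    import java.io.File;

    // Hypothetical Linux-only spot check: count the descriptors this JVM
    // currently holds open by listing /proc/self/fd, then compare the number
    // against `ulimit -n` and `sysctl fs.file-nr`.
    public class FdCount {
        public static void main(String[] args) {
            String[] fds = new File("/proc/self/fd").list();
            System.out.println("open file descriptors: "
                    + (fds == null ? "unknown (/proc not available)" : fds.length));
        }
    }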

I'm almost completely out of things to test and I'm no closer to solving this than the day that I found out about it. Has anyone experienced anything similar?

EDIT 07/12/2011: I found an OSX machine to use for some testing and have confirmed that this happens on OSX. I've also tested on physical Linux boxes and replicated the issue, so the only OS on which I've been unable to reproduce it is Windows. I'm guessing this has something to do with POSIX handling of file descriptors, since that seems to be the only relevant difference between the test systems (JDK version, Tomcat version and webapp were all identical across all platforms).


The reason you probably don't see this happening on Windows might be that FSDirectory.open defaults to SimpleFSDirectory there.

Check out the warnings at the top of the FSDirectory and NIOFSDirectory javadocs - the text in red at http://lucene.apache.org/java/3_3_0/api/core/org/apache/lucene/store/NIOFSDirectory.html:

NOTE: Accessing this class either directly or indirectly from a thread while it's interrupted can close the underlying file descriptor immediately if at the same time the thread is blocked on IO. The file descriptor will remain closed and subsequent access to NIOFSDirectory will throw a ClosedChannelException. If your application uses either Thread.interrupt() or Future.cancel(boolean) you should use SimpleFSDirectory in favor of NIOFSDirectory
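To make that failure mode concrete, here is a small standalone sketch (my own illustration of the JDK behaviour, not Lucene code): once one thread is interrupted while doing channel IO, the channel is closed for everyone sharing it, and later callers get ClosedChannelException.

    import java.io.File;
    import java.io.RandomAccessFile;
    import java.nio.ByteBuffer;
    import java.nio.channels.ClosedChannelException;
    import java.nio.channels.FileChannel;

    // Standalone illustration of the NIO behaviour described in the warning above.
    public class InterruptClosesChannel {
        public static void main(String[] args) throws Exception {
            File f = File.createTempFile("nio-demo", ".bin");
            f.deleteOnExit();
            final FileChannel channel = new RandomAccessFile(f, "rw").getChannel();

            Thread worker = new Thread(new Runnable() {
                public void run() {
                    try {
                        // Simulate Thread.interrupt()/Future.cancel(true) arriving
                        // while this thread is touching the channel.
                        Thread.currentThread().interrupt();
                        channel.read(ByteBuffer.allocate(16), 0L);
                    } catch (Exception e) {
                        // Typically java.nio.channels.ClosedByInterruptException.
                        System.out.println("interrupted thread saw: " + e);
                    }
                }
            });
            worker.start();
            worker.join();

            try {
                // Side effect: the shared channel is now closed for every caller.
                channel.read(ByteBuffer.allocate(16), 0L);
            } catch (ClosedChannelException e) {
                System.out.println("every other caller now sees: " + e);
            }
        }
    }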

https://issues.apache.org/jira/browse/LUCENE-2239
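If switching makes sense for your setup, a minimal sketch of opening the index with SimpleFSDirectory explicitly (Lucene 3.x API; the index path and class name here are hypothetical) would look like:

    import java.io.File;
    import java.io.IOException;

    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.SimpleFSDirectory;

    public class ReaderFactory {
        public static IndexReader openReader() throws IOException {
            // Choose SimpleFSDirectory explicitly instead of FSDirectory.open(),
            // which picks NIOFSDirectory on Linux/OSX. SimpleFSDirectory reads
            // through RandomAccessFile, so an interrupted request thread does not
            // close the descriptor out from under other threads sharing the reader.
            Directory dir = new SimpleFSDirectory(new File("/path/to/index"));
            return IndexReader.open(dir, true); // read-only IndexReader
        }
    }

The trade-off is that SimpleFSDirectory synchronizes reads on the underlying file, so concurrent read throughput can be lower than with NIOFSDirectory.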
