Any way to interrupt, kill, or otherwise unwind (releasing synchronization locks) a single deadlocked Java thread allowing other threads to continue?

Developer (https://www.devze.com), 2022-12-29 14:22. Source: the web
I have a long-running process where, due to a bug, a trivial/expendable thread is deadlocked with a thread which I would like to continue, so that it can perform some final reporting that would be hard to reproduce in another way.

Of course, fixing the bug for future runs is the proper ultimate resolution. Of course, any such forced interrupt/kill/stop of any thread is inherently unsafe and likely to cause other unpredictable inconsistencies. (I'm familiar with all the standard warnings and the reasons for them.)

But still, since the only alternative is to kill the JVM process and go through a more lengthy procedure which would result in a less-complete final report, messy/deprecated/dangerous/risky/one-time techniques are exactly what I'd like to try.

The JVM is Sun's 1.6.0_16 64-bit on Ubuntu, and the expendable thread is waiting-to-lock an object monitor.

Can an OS signal directed to an exact thread create an InterruptedException in the expendable thread?

Could attaching with gdb, and directly tampering with JVM data or calling JVM procedures allow a forced-release of the object monitor held by the expendable thread?

Would a Thread.interrupt() from another thread generate an InterruptedException from the waiting-to-lock frame? (With some effort, I can inject an arbitrary BeanShell script into the running system.)

Can the deprecated Thread.stop() be sent via JMX or any other remote-injection method?

Any ideas appreciated, the more 'dangerous', the better! And, if your suggestion has worked in personal experience in a similar situation, the best!


Can an OS signal directed to an exact thread create an InterruptedException in the expendable thread?

No.

Could attaching with gdb, and directly tampering with JVM data or calling JVM procedures allow a forced-release of the object monitor held by the expendable thread?

In theory Yes. In practice, you would need a deep understanding of the internals of the JVM to have any chance of succeeding. So, realistically, No.

Would a Thread.interrupt() from another thread generate an InterruptedException from the waiting-to-lock frame? (With some effort, I can inject an arbitrary BeanShell script into the running system.)

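Locating the thread is the easy part: the BeanShell script would need to find the Thread object for the thread to be interrupted, which may involve traversing the tree of ThreadGroup objects, etc. The harder problem is that a thread waiting to lock an object monitor (that is, blocked entering a synchronized block or method) does not respond to interrupts. Thread.interrupt() only sets the thread's interrupt flag, and InterruptedException is thrown only from interruptible calls such as Object.wait(), Thread.sleep(), or Thread.join(). So for the case you describe, the answer is effectively No. If the thread were instead stuck in wait(), the interrupt would be delivered, but then the issue is whether the interrupted thread is going to behave properly. For example, a lot of folks write their wait/notify code to catch and ignore InterruptedException and retry. If you've done that, the interrupt probably won't do any good.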
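The lookup-and-interrupt step can be tried out in isolation before injecting anything into the real process. The sketch below is only an illustration under assumptions: the class name InterruptBlockedDemo, the thread name "expendable-demo", and the helper names are all made up, and it uses Thread.getAllStackTraces() as a shortcut for walking the ThreadGroup tree. It blocks a demo thread on a monitor, interrupts it, and reports the state the thread is in afterwards.

```java
import java.util.concurrent.CountDownLatch;

final class InterruptBlockedDemo {
    /** Returns the first live thread with the given name, or null. Names need not be unique. */
    static Thread findThreadByName(String name) {
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            if (t.getName().equals(name)) {
                return t;
            }
        }
        return null;
    }

    /** Blocks a demo thread on a monitor, interrupts it, and reports its state afterwards. */
    static Thread.State interruptWhileBlocked() {
        final Object monitor = new Object();
        final CountDownLatch started = new CountDownLatch(1);
        Thread victim = new Thread(() -> {
            started.countDown();
            synchronized (monitor) {
                // Reached only after the caller releases the monitor.
            }
        }, "expendable-demo");
        victim.setDaemon(true);
        Thread.State observed = null;
        try {
            synchronized (monitor) {               // hold the monitor so the victim must wait
                victim.start();
                started.await();
                while (victim.getState() != Thread.State.BLOCKED) {
                    Thread.sleep(10);              // wait until it is really waiting-to-lock
                }
                findThreadByName("expendable-demo").interrupt();
                Thread.sleep(200);                 // give the interrupt time to have any effect
                observed = victim.getState();
            }
            victim.join();                         // monitor released, so the victim can finish
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return observed;
    }

    public static void main(String[] args) {
        System.out.println("State after interrupt: " + interruptWhileBlocked());
    }
}
```

If the reported state is still BLOCKED, the interrupt did not move the thread out of the waiting-to-lock frame, which is the situation described in the question.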

Can the deprecated Thread.stop() be sent via JMX or any other remote-injection method?

If you can call Thread.interrupt() you can use the same approach to call Thread.stop(). Normally, I'd say don't do it. But in this situation it might be worth a try.

But the real lesson from all of this is that an application that can take days or weeks to produce an answer ought to implement a checkpoint / resume mechanism to deal with this kind of eventuality, and things like power failure, hardware failure, machine reboot, etc.


Forget it. In the very best case you could detect a deadlock through some watchdog timer, ignore the threads that are stuck, and create new threads to continue the work. Not very satisfying. You can't unlock the locks involved, and there are two locks being held (or more). You can't make the "expendable" thread release the lock it's holding.
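For the detection half of such a watchdog, the JVM's own thread management bean can report deadlocked threads. This is a minimal sketch rather than a complete watchdog: the class name DeadlockWatchdog, the demo thread names, and the polling interval are assumptions for illustration. It only reports which threads are stuck; it does not release any lock.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

final class DeadlockWatchdog {
    /** Describes every thread the JVM currently reports as deadlocked (empty if none). */
    static List<String> describeDeadlocks() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long[] ids = mx.findDeadlockedThreads();   // null when no deadlock is found
        List<String> result = new ArrayList<>();
        if (ids == null) {
            return result;
        }
        for (ThreadInfo info : mx.getThreadInfo(ids)) {
            if (info != null) {                    // null if the thread has since died
                result.add(info.getThreadName() + " waits for " + info.getLockName()
                        + " held by " + info.getLockOwnerName());
            }
        }
        return result;
    }

    /** Polls until a deadlock is reported or the timeout expires. */
    static List<String> awaitDeadlocks(long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        List<String> found = describeDeadlocks();
        while (found.isEmpty() && System.currentTimeMillis() < deadline) {
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
            found = describeDeadlocks();
        }
        return found;
    }

    /** Demo only: deadlocks two daemon threads by locking two monitors in opposite order. */
    static void startDemoDeadlock() {
        final Object a = new Object();
        final Object b = new Object();
        final CountDownLatch bothHoldOne = new CountDownLatch(2);
        Thread t1 = new Thread(() -> lockInOrder(a, b, bothHoldOne), "demo-1");
        Thread t2 = new Thread(() -> lockInOrder(b, a, bothHoldOne), "demo-2");
        t1.setDaemon(true);
        t2.setDaemon(true);
        t1.start();
        t2.start();
    }

    private static void lockInOrder(Object first, Object second, CountDownLatch latch) {
        synchronized (first) {
            latch.countDown();
            try {
                latch.await();                     // wait until both threads hold their first lock
            } catch (InterruptedException e) {
                return;
            }
            synchronized (second) {
                // Never reached: the other thread holds this monitor.
            }
        }
    }

    public static void main(String[] args) {
        startDemoDeadlock();
        for (String line : awaitDeadlocks(5000)) {
            System.out.println(line);
        }
    }
}
```

The demo threads are daemons so that the JVM can still exit even though they never finish.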

There's a rather simple method to detect potential deadlocks: Assign a level from 1 upwards to each lock. Enforce the rule "While holding a lock, a thread must only acquire locks with a lower level". If you catch a violation of the rule, fix the numbering. If it can't be fixed, then you have a potential deadlock which could with bad luck turn into a real deadlock. Change your code.
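The level rule can be enforced at run time with a small wrapper around a lock. The following is a sketch under assumptions, not a drop-in utility: LeveledLock is a hypothetical name, it wraps ReentrantLock rather than synchronized blocks, it assumes locks are released in the reverse order of acquisition, and it deliberately rejects re-acquiring a lock the thread already holds.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.locks.ReentrantLock;

final class LeveledLock {
    // Levels of the locks held by the current thread, most recently acquired on top.
    // Because each new level must be lower than the top, the top is always the lowest held.
    private static final ThreadLocal<Deque<Integer>> HELD =
            ThreadLocal.withInitial(ArrayDeque::new);

    private final int level;
    private final ReentrantLock lock = new ReentrantLock();

    LeveledLock(int level) {
        this.level = level;
    }

    /** Acquires the lock, or throws if that would break the "only lower levels" rule. */
    void lock() {
        Deque<Integer> held = HELD.get();
        Integer lowestHeld = held.peek();
        if (lowestHeld != null && level >= lowestHeld) {
            throw new IllegalStateException("lock order violation: level " + level
                    + " requested while holding level " + lowestHeld);
        }
        lock.lock();
        held.push(level);
    }

    /** Releases the lock; assumes it is the most recently acquired one. */
    void unlock() {
        lock.unlock();       // throws if the current thread is not the owner
        HELD.get().pop();
    }

    public static void main(String[] args) {
        LeveledLock high = new LeveledLock(2);
        LeveledLock low = new LeveledLock(1);
        high.lock();
        low.lock();          // allowed: 1 is lower than 2
        low.unlock();
        high.unlock();
        low.lock();
        try {
            high.lock();     // rejected: 2 is not lower than 1
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        } finally {
            low.unlock();
        }
    }
}
```

The check runs before the real lock is taken, so a violation shows up as an exception at the offending call site rather than as a deadlock that depends on timing.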
