In November 2009 I went to No Fluff Just Stuff. One of the presentations, by Brian Goetz, was about Java concurrency. For some reason, some items on his agenda slide were not covered in the presentation.
He went over some strategies, and at the end he pointed out a common tactic that he also said is a good rule of thumb for concurrency in Java: make your variables private, and make any methods that access them synchronized.
That sounds pretty simple. And perhaps too good to be true. Are there situations/applications where this concurrency technique would not be sufficient? Would relying primarily on this technique work well in systems with lots of transactions, or large data sets? What are the potential drawbacks of doing this?
Are there situations/applications where this concurrency technique would not be sufficient?
If you have an operation which you would like to be synchronized which spans several of these methods, then marking each individual method as synchronized is not sufficient.
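A minimal sketch of that failure mode (the Counter class and its method names are my own, not from the talk): each method is individually synchronized, yet a check-then-act sequence built out of them is still racy, and the fix is to make the compound operation itself one synchronized unit.

```java
// Each method is synchronized, but composing them is not atomic.
public class Counter {
    private int value = 0;

    public synchronized int get() { return value; }
    public synchronized void set(int v) { value = v; }

    // Broken composition: another thread may run between get() and set(),
    // so two concurrent calls can both read the same value and lose an update.
    public void unsafeIncrement() {
        set(get() + 1);
    }

    // Fix: the whole read-modify-write happens under one lock acquisition.
    public synchronized void increment() {
        value++;
    }
}
```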
Would relying primarily on this technique work well in systems with lots of transactions, or large data sets?
I doubt it, as the synchronization bounds for operations rarely seem to be this granular.
What are the potential drawbacks of doing this?
Synchronization is not free: there are costs (although usually minor) involved in acquiring locks. Unnecessary synchronization means unnecessary cost.
Frankly, after reading Java Concurrency in Practice, I am really surprised that Goetz would give out this advice, as the main lessons of that book are that concurrency is a complicated topic, the solutions to make your application threadsafe and performant with concurrent access are highly customized to the specifics of the application, and there are no simple and quick answers.
Although I'm new to concurrent programming, I don't think "make all members private and the methods synchronized" is a panacea for concurrency.
Have a look at: http://download.oracle.com/javase/tutorial/essential/concurrency/index.html
Also, you might be forced to use synchronized methods when it comes to shared data (data that could be accessed and/or modified by multiple threads). In other cases you might not need the synchronized keyword at all.
Beyond performance implications, there are cases where this level of protection is insufficient. For example, setX() and setY(). If your API exposes those separately, you can end up with (x1, y2) when one thread sets (x1, y1) while another sets (x2, y2). To avoid this, you need to lock the object before calling the setters, or redesign your API to support setXY(x, y).
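The two designs side by side might look like this (a hypothetical Point class, not from the original answer): the separate synchronized setters permit the (x1, y2) interleaving, while the combined setter updates both fields under one lock acquisition.

```java
public class Point {
    private int x, y;

    // Individually safe, but two threads calling setX/setY pairs can
    // interleave, leaving x from one thread and y from the other.
    public synchronized void setX(int x) { this.x = x; }
    public synchronized void setY(int y) { this.y = y; }

    // Both coordinates change under a single lock acquisition, so no
    // other thread can observe or produce a mixed state.
    public synchronized void setXY(int x, int y) {
        this.x = x;
        this.y = y;
    }

    // Read both under the same lock for a consistent snapshot.
    public synchronized int[] getXY() { return new int[] { x, y }; }
}
```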
It's absolutely not sufficient in many cases. Consider the case where you want to switch the position of two elements of a Vector (a class that basically follows the tactic Goetz described). Your code would consist of a call to removeElement followed by a call to insertElementAt (both of which are synchronized). If, however, the Vector were modified by another thread in between these calls, the result could be utter garbage.
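A sketch of the client-side fix for this kind of multi-call sequence (the swap helper is my own illustration): Vector synchronizes its methods on the Vector instance itself, so holding that same lock around the whole sequence makes the compound operation atomic.

```java
import java.util.Vector;

public class VectorSwap {
    // Swap the elements at positions i and j. Each individual Vector call
    // is synchronized, but only the enclosing synchronized (v) block stops
    // another thread from modifying the Vector mid-swap.
    public static <T> void swap(Vector<T> v, int i, int j) {
        synchronized (v) {
            T tmp = v.get(i);
            v.set(i, v.get(j));
            v.set(j, tmp);
        }
    }
}
```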
While Goetz is right that this is a common tactic, it is often the case that higher-level synchronization is needed. That's why ArrayList was introduced — it's basically a Vector without the (often unnecessary) overhead of method-level synchronization.
That seems like some serious overkill. It's also not going to suffice for an increment/get sequence, where another thread could potentially interleave a set() call.
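For the increment/get case specifically, one standard alternative (my suggestion, not from the answer above) is java.util.concurrent.atomic, where the read-modify-write is a single atomic step and no set() from another thread can land in the middle of it.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class SafeCounter {
    private final AtomicInteger value = new AtomicInteger();

    // One indivisible read-modify-write; no interleaving window exists.
    public int incrementAndGet() { return value.incrementAndGet(); }

    public int get() { return value.get(); }

    public void set(int v) { value.set(v); }
}
```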
If I ever ran into a codebase where every object had synchronized getter/setter methods, I'd have serious doubts about the coder - most likely I'd think they'd just recently read about synchronization and didn't completely understand it.
A clear example is given in Goetz's book referenced by [matt b].
Transferring funds from account A to B:

    synchronized (a) {
        synchronized (b) {
            // do the transfer
        }
    }
Seems clear enough, but what happens if two threads try to transfer simultaneously, one moving $1 from A to B and the other $2 from B to A, and the time slice happens between the two synchronized blocks? Each thread then holds one account's lock while waiting for the other's: deadlock.
One solution is to impose an ordering, always locking the account with the smaller account number first. The book has more examples and solutions. So no, there's no easy answer, but "make fields private and synchronize member access" will be sufficient for many applications' needs.
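The lock-ordering fix might be sketched like this (Account is a hypothetical class invented for the illustration; the book's version differs in detail): whichever account has the smaller id is locked first, so two opposing transfers can never acquire the locks in reverse order.

```java
class Account {
    final long id;
    long balance;

    Account(long id, long balance) {
        this.id = id;
        this.balance = balance;
    }
}

public class Transfers {
    // Always acquire locks in a global order (smaller id first) so that
    // an A->B transfer and a B->A transfer cannot deadlock each other.
    public static void transfer(Account from, Account to, long amount) {
        Account first  = from.id < to.id ? from : to;
        Account second = from.id < to.id ? to : from;
        synchronized (first) {
            synchronized (second) {
                from.balance -= amount;
                to.balance += amount;
            }
        }
    }
}
```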
"Make your variables private, and make any methods that access them synchronized." - is reasonable advice in itself, and can make your code thread-safe in certain very basic circumstances.
However it definitely isn't sufficient to ensure correct concurrent operation in more complicated situations - e.g. where you need to modify two different objects at once in a single transaction. In general there is a problem that "locks don't compose" when you try to glue two or more locked/synchronised operations together.
If you are interested in concurrency and mutable state, I strongly recommend watching this presentation on "Value, Identity and State" by Rich Hickey. It talks about the design of Clojure's concurrency system, but the principles are important and applicable to any language.
The technique generates thread-safe code in the sense that it protects your Java variables from read/write interference between multiple threads. However, that doesn't mean your code is correct: sometimes synchronization applies to a sequence of operations that isn't well represented by a single method. Imagine a routine that accepts a rectangle and sets both the width and the length to 5, and another routine on a different thread that sets the width to 3 and the length to 6. Even though setWidth and setLength are each synchronized, thread one can set the width, thread two can set both the width and the length, and then thread one can set the length. Now the rectangle has width 5 and length 6, which is correct according to neither thread. Note that if the rectangle were immutable, this particular problem couldn't happen.
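The immutability remark at the end can be sketched as follows (ImmutableRect is my own illustrative class): because a "modification" produces a complete new object, no thread can ever observe a rectangle with the width from one update and the length from another.

```java
public final class ImmutableRect {
    private final int width;
    private final int length;

    public ImmutableRect(int width, int length) {
        this.width = width;
        this.length = length;
    }

    // Instead of mutating, return a fully-formed new rectangle; every
    // object a reader can hold is internally consistent.
    public ImmutableRect withSize(int width, int length) {
        return new ImmutableRect(width, length);
    }

    public int width()  { return width; }
    public int length() { return length; }
}
```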
Here's an example that comes up in larger systems. Imagine a distributed system where you need to synchronize two files on two machines. You need to obtain some sort of synchronization lock on each file. If many different threads are in contention for different files, you need a mechanism to establish who gets the lock. There are a variety of schemes for handling this, so it isn't an unknown problem, but you can see that it isn't just as simple as two private variables in a single object.
Now, your follow-up: what are the downsides? If you have an immutable resource, there's probably no need to protect against multiple reads by different threads. As a consequence, the additional overhead of the synchronization code is unnecessary. Your program, while correct, is slower than another correct program implemented with the same algorithms just because of unnecessary synchronization.
I somehow doubt he would recommend this method since it's really flawed and could cause more problems than it solves. But if he did recommend it the way you described it, he is wrong.
As others have already mentioned, this method does not provide atomicity (multiple operations are not performed atomically). This is a major issue.
Some other possible problems that come to mind:
- all members private and all methods synchronized? Including the setters and getters? That's just bad, bad, bad.
- how about those cases where performance is really critical and your reads would block/serialize both writes and other reads? That's also bad; a better alternative would be something like ReentrantReadWriteLock.
- prone to deadlocks, see the example karmakaze provided
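The read/write-lock alternative from the list above might look like this (the Cache class is my own sketch): multiple readers can proceed in parallel, and only writers take the lock exclusively, instead of every access serializing on one monitor.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class Cache {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private int value;

    // Many threads may hold the read lock at once.
    public int read() {
        lock.readLock().lock();
        try {
            return value;
        } finally {
            lock.readLock().unlock();
        }
    }

    // The write lock is exclusive: it waits for all readers and writers.
    public void write(int v) {
        lock.writeLock().lock();
        try {
            value = v;
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```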
But the most dangerous thing is to treat this rule as a Swiss Army knife, when it clearly isn't. It's useful to remember that synchronized has been part of Java from the beginning; why would all the alternatives exist otherwise? One thing must be clear: it's not only for performance reasons that we have all these other features nowadays:
- lock classes
- plenty of concurrent collections and utilities
- a proper volatile and a well-defined Java Memory Model (JMM) as of Java 5
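As one small taste of those concurrent classes (my example, not the answerer's): ConcurrentHashMap gives you atomic per-entry read-modify-write operations out of the box, with no hand-rolled synchronized wrapper at all.

```java
import java.util.concurrent.ConcurrentHashMap;

public class HitCounter {
    private final ConcurrentHashMap<String, Integer> hits = new ConcurrentHashMap<>();

    // merge() performs the read-modify-write on the entry atomically,
    // so concurrent calls for the same key never lose updates.
    public int record(String page) {
        return hits.merge(page, 1, Integer::sum);
    }

    public int count(String page) {
        return hits.getOrDefault(page, 0);
    }
}
```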
My recommendation? Try to learn as much as you can about concurrency in Java; there are lots of resources out there, and Google (or any decent search engine, for that matter) is your best friend. Just be prepared to spend some time: it ain't easy, and it can get really deep if you want to understand what's going on down the rabbit hole. The CPUs of today and tomorrow, with an ever-increasing number of hardware threads, make concurrency skills a must, so there's no reason to skimp on this.
I am pretty sure I know which presentation you're thinking of, and you skipped quite a few steps.
The advice given in that presentation was:
- Encapsulate your state (nothing controversial here).
- To the extent you need to synchronize access to state, encapsulate that synchronization as well; in other words, it's a bad call to export synchronization requirements onto your clients.
To get from there to "make all variables private and all methods that access them synchronized" is to skip over the analysis of which variables are actually shared and what the logical policy for synchronizing access to that state should be, and jump straight to "synchronize everything."