I'm using joda due to it's good reputation regarding multi threading. It goes great distances to make multi threaded date handling efficient, for example by making all Date/Time/DateTime objects immutable.
But here's a situation where I'm not sure if Joda is really doing the right thing. It probably does, but I'm very interested to see an explanation.
When a toString() of a DateTime is being called Joda does the following:
/* org.joda.time.base.AbstractInstant */
public String toString() {
return ISODateTimeFormat.dateTime().print(this);
}
All formatter开发者_如何学运维s are thread safe (they immutable as well), but what's about the formatter-factory:
private static DateTimeFormatter dt;
/* org.joda.time.format.ISODateTimeFormat */
public static DateTimeFormatter dateTime() {
if (dt == null) {
dt = new DateTimeFormatterBuilder()
.append(date())
.append(tTime())
.toFormatter();
}
return dt;
}
This is a common pattern in single threaded applications but it is known to be error-prone in a multithreaded environment.
I see the following dangers:
- Race condition during null check --> worst case: two objects get created.
No Problem, as this is solely a helper object (unlike a normal singleton pattern situation), one gets saved in dt, the other is lost and will be garbage collected sooner or later.
- the static variable might point to a partially constructed object before the objec has been finished initialization
(before calling me crazy, read about a similar situation in this Wikipedia article.)
So how does Joda ensure that no partially created formatter gets published in this static variable?
Thanks for your explanations!
Reto
You said, that formatters are read-only. If they use only final fields (I didn't read a formatter source) then in 3-rd edition of Java Language specification they are guarded from partital object creation by "Final Field Semantics". I didn't check 2-nd JSL edition and not sure, if such initalization is correct in that edition.
Look at chapter 17.5 and 17.5.1 in JLS. I wll construct a "event chain" for required happens-before relation.
First of all, somewhere in the constructor there is a write to the final field in the formatter. It is the write w. When constructor completes, a "freeze" action took plase. Let's call it f. Somewhere later in the program order (after return from the constructor, maybe some other methods and return from toFormatter) there is a write to the dt field. Let's give this write a name a. This write (a) is after the freeze action (f) in the "program order" (order in the single-threaded execution) and thus f happens-before a (hb(f, a)) just by JLS definition. Phew, initialization complete... :)
Somewtimes later, in another thread, a call to dateTime().format occurs. At that time we need two reads. First of the two is read of the final variable in the formatter object. Let's call it r2 (to be consistent with the JLS). Second of the two is a read of the "this" for the Formatter. This occurs during call to the dateTime() method when the dt field is read. And let's call this read r1. What do we have now? Read r1 saw some write to the dt. I consider that that write was the action a from previous paragraph (only one thread wrote that field, just for simplicity). As r1 see the write a, then there is mc(a, r1) ("Memory chain" relation, first clause definition). Current thread did not initialized a formatter, reads it field in action r2 and sees an "address" of a formatter read at the action r1. Thus, by definition, there is a dereferences(r1, r2) (another action ordering from JLS).
We have write before freeze, hb(w, f). We have freeze before assignment of the dt, hb(f, a). We have a read from dt, mc(a, r1). And we have a dereference chain between r1 and r2, dereferences(r1, r2). All this leads to a happens-before relation hb(w, r2) just by JLS definition. Also, by definition, hb(d, w) where d is a write of the default value for the final field in the object. Thus, read r2 can not see write w and must see write r2 (the only write to the field from a program code).
Same is the order for more indirect field access (final field of the object stored in the final field, etc...).
But that is not all! There is no access to a partialy constructed object. But there is a more interesting bug. In the lack of any explicit synhronization dateTime() may return null. I don't think that such behaviour can be observed in practice, but JLS 3-rd edition does not prevent such behaviour. First read of dt field in the method may see a value initialized by another thread, but second read of dt can see a "write of defalut value". No happens-before relations exists to prevent it. Such possible behaviour is specific to the 3-rd edition, second edition have a "write to main memory"/"read from main memory" which does not allow a thread to see a values of variable going back in time.
This is a bit of a non-answer, but the simplest explanation for
So how does Joda ensure that not partially created formatter gets published in this static variable?
might just be that they don't ensure anything, and either the developers didn't realize it could be a bug or felt it wasn't worth synchronizing over.
I asked a similar question on the Joda mailing list in 2007, though I didn't find the answers to be conclusive and I avoided Joda time as a result, for better or worse.
Version 3 of the Java Language Spec guarantees that object reference updates are atomic, regardless of whether they're 32-bit or 64-bit. This, combined with the arguments outlined above, makes the Joda code thread-safe IMO (see java.sun.com/docs/books/jls/third_edition/html/memory.html#17.7)
IIRC, version 2 of the JLS did not include the same explicit clarification about object references, i.e. only 32-bit refs were guaranteed atomic, so if you were using a 64 bit JVM there was no guarantee it would work. At the time I was using Java 1.4, which pre-dated JLS v3.
IMO the worst case is not two objects getting created but several (as many as there are threads calling dateTime()
, to be precise). Since dateTime()
is not synchronized and dt
is neither final nor volatile, a change to its value in one thread is not guaranteed to be visible to other threads. So even after one thread initialized dt
, any number of other threads may still see the reference as null, thus happily create new objects.
Other than that, as explained by others, a partially created object can not get published by dateTime()
. Neither can a partially changed ( = dangling) reference, since reference value updates are guaranteed to be atomic.
精彩评论