Double checked locking on C++: new to a temp pointer, then assign it to instance_问答_开发者

Anything wrong with the following Singleton implementation?

Foo& Instance() {
    if (foo) {
        return *foo;
    }
    else {
        scoped_lock lock(mutex);

        if (foo) {
            return *foo;
        }
        else {
            // Don't do foo = new Foo;
            // because that line *may* be a 2-step 
            // process comprising (not necessaril开发者_JAVA技巧y in order)
            // 1) allocating memory, and 
            // 2) actually constructing foo at that mem location.
            // If 1) happens before 2) and another thread
            // checks the foo pointer just before 2) happens, that 
            // thread will see that foo is non-null, and may assume 
            // that it is already pointing to a a valid object.
            //
            // So, to fix the above problem, what about doing the following?

            Foo* p = new Foo;
            foo = p; // Assuming no compiler optimisation, can pointer 
                     // assignment be safely assumed to be atomic? 
                     // If so, on compilers that you know of, are there ways to 
                     // suppress optimisation for this line so that the compiler
                     // doesn't optimise it back to foo = new Foo;?
        }
    }
    return *foo;
}

No, you cannot even assume that foo = p; is atomic. It's possible that it might load 16 bits of a 32-bit pointer, then be swapped out before loading the rest.

If another thread sneaks in at that point and calls Instance(), you're toasted because your foo pointer is invalid.

For true security, you will have to protect the entire test-and-set mechanism, even though that means using mutexes even after the pointer is built. In other words (and I'm assuming that scoped_lock() will release the lock when it goes out of scope here (I have little experience with Boost)), something like:

Foo& Instance() {
    scoped_lock lock(mutex);
    if (foo != 0)
        foo = new Foo();
    return *foo;
}

If you don't want a mutex (for performance reasons, presumably), an option I've used in the past is to build all singletons before threading starts.

In other words, assuming you have that control (you may not), simply create an instance of each singleton in main before kicking off the other threads. Then don't use a mutex at all. You won't have threading problems at that point and you can just use the canonical don't-care-about-threads-at-all version:

Foo& Instance() {
    if (foo != 0)
        foo = new Foo();
    return *foo;
}

And, yes, this does make your code more dangerous to people who couldn't be bothered to read your API docs but (IMNSHO) they deserve everything they get :-)

Why not keep it simple?

Foo& Instance()
{
    scoped_lock lock(mutex);

    static Foo instance;
    return instance;
}

Edit: In C++11 where threads is introduced into the language. The following is thread safe. The language guarantees that instance is only initialized once and in a thread safe manor.

Foo& Instance()
{
    static Foo instance;
    return instance;
}

So its lazily evaluated. Its thread safe. Its very simple. Win/Win/Win.

This depends on what threading library you're using. If you're using C++0x you can use atomic compare-and-swap operations and write barriers to guarantee that double-checked locking works. If you're working with POSIX threads or Windows threads, you can probably find a way to do it. The bigger question is why? Singletons, it turns out, are usually unnecessary.

the new operator in c++ always invovle 2-steps process :
1.) allocating memory identical to simple malloc
2.) invoke constructor for given data type

Foo* p = new Foo;
foo = p;

the code above will make the singleton creation into 3 step, which is even vulnerable to problem you trying to solve.

Why don't you just use a real mutex ensuring that only one thread will attempt to create foo?

Foo& Instance() {
    if (!foo) {
        pthread_mutex_lock(&lock);
        if (!foo) {
            Foo *p = new Foo;
            foo = p;
        }
        pthread_mutex_unlock(&lock);
    }
    return *foo;
}

This is a test-and-test-and-set lock with free readers. Replace the above with a reader-writer lock if you want reads to be guaranteed safe in a non-atomic-replacement environment.

edit: if you really want free readers, you can write foo first, and then write a flag variable fooCreated = 1. Checking fooCreated != 0 is safe; if fooCreated != 0, then foo is initialized.

Foo& Instance() {
    if (!fooCreated) {
        pthread_mutex_lock(&lock);
        if (!fooCreated) {
            foo = new Foo;
            fooCreated = 1;
        }
        pthread_mutex_unlock(&lock);
    }
    return *foo;
}

It has nothing wrong with your code. After the scoped_lock, there will be only one thread in that section, so the first thread that enters will initialize foo and return, and then second thread(if any) enters, it will return immediately because foo is not null anymore.

EDIT: Pasted the simplified code.

Foo& Instance() {
  if (!foo) {
    scoped_lock lock(mutex);
    // only one thread can enter here
    if (!foo)
        foo = new Foo;
  }
  return *foo;
}

Thanks for all your input. After consulting Joe Duffy's excellent book, "Concurrent Programming on Windows", I am now thinking that I should be using the code below. It's largely the code from his book, except for some renames and the InterlockedXXX line. The following implementation uses:

volatile keyword on both the temp and "actual" pointers to protect against re-ordering from the compiler.
InterlockedCompareExchangePointer to protect against reordering from the CPU.

So, that should be pretty safe (... right?):

template <typename T>
class LazyInit {
public:
    typedef T* (*Factory)();
    LazyInit(Factory f = 0) 
        : factory_(f)
        , singleton_(0)
    {
        ::InitializeCriticalSection(&cs_);
    }

    T& get() {
        if (!singleton_) {
            ::EnterCriticalSection(&cs_);
            if (!singleton_) {
                T* volatile p = factory_();
                // Joe uses _WriterBarrier(); then singleton_ = p;
                // But I thought better to make singleton_ = p atomic (as I understand, 
                // on Windows, pointer assignments are atomic ONLY if they are aligned)
                // In addition, the MSDN docs say that InterlockedCompareExchangePointer
                // sets up a full memory barrier.
                ::InterlockedCompareExchangePointer((PVOID volatile*)&singleton_, p, 0);
            }
            ::LeaveCriticalSection(&cs_);
        }
        #if SUPPORT_IA64
        _ReadBarrier();
        #endif
        return *singleton_;
    }

    virtual ~LazyInit() {
        ::DeleteCriticalSection(&cs_);
    }
private:
    CRITICAL_SECTION cs_;
    Factory factory_;
    T* volatile singleton_;
};