How does the lock statement ensure inter-processor synchronization?

https://www.devze.com 2023-03-05 13:53

I have a small test application that executes two threads simultaneously. One increments a static long _value, the other one decrements it. I've ensured with ProcessThread.ProcessorAffinity that the threads are associated with different physical (no HT) cores to force inter-processor communication, and I have ensured that they overlap in execution time for a significant amount of time.

Of course, the following does not lead to zero:

for (long i = 0; i < 10000000; i++)
{
    _value += offset;
}

So, the logical conclusion would be to:

for (long i = 0; i < 10000000; i++)
{
    Interlocked.Add(ref _value, offset);
}

Which of course leads to zero.

However, the following also leads to zero:

for (long i = 0; i < 10000000; i++)
{
    lock (_syncRoot)
    {
        _value += offset;
    }
}

Of course, the lock statement ensures that reads and writes are not reordered, because it employs a full fence. However, I cannot find any information concerning the synchronization of processor caches. If there were no cache synchronization, I'd expect to see a deviation from zero after both threads finished.

Can someone explain to me how lock/Monitor.Enter/Exit ensures that processor caches (L1/L2 caches) are synchronized?
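For reference, a minimal, self-contained version of the test described above might look like the following. This is a sketch reconstructed from the question (the original harness with ProcessorAffinity is not shown); the class and method names are hypothetical, and the affinity setup is omitted for portability.

```csharp
using System;
using System.Threading;

class LockCounterTest
{
    static long _value;
    static readonly object _syncRoot = new object();

    // Each worker applies its offset ten million times under the lock.
    static void Worker(long offset)
    {
        for (long i = 0; i < 10_000_000; i++)
        {
            lock (_syncRoot)
            {
                _value += offset;
            }
        }
    }

    static void Main()
    {
        var incrementer = new Thread(() => Worker(+1));
        var decrementer = new Thread(() => Worker(-1));
        incrementer.Start();
        decrementer.Start();
        incrementer.Join();
        decrementer.Join();
        Console.WriteLine(_value); // prints 0: the locked increment is atomic
    }
}
```

Removing the `lock` block (or replacing it with an unprotected `_value += offset;`) typically yields a nonzero result, which is exactly the behavior the question describes.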


Cache coherence in this case does not depend on lock. The lock statement ensures that the two threads' instruction sequences are not interleaved. a += b is not atomic to the processor; it looks like:

  • Load data into register from memory
  • Increment data
  • Store data back

Without the lock, the two threads' instructions may interleave like this:

  • Load data into register X from memory
  • Load data into register Y from memory
  • Increment data (in X)
  • Decrement data (in Y)
  • Store data back (from X)
  • Store data back (from Y) // In this case increment is lost.

But this is not about cache coherence; atomicity is a higher-level concern.
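The lost-update interleaving listed above can be made concrete by splitting the read-modify-write into explicit steps. The sketch below simulates the six-step schedule on a single thread using local variables as stand-ins for the two cores' registers (the names are illustrative, not from the original post):

```csharp
using System;

class LostUpdateDemo
{
    static long _value;

    static void Main()
    {
        // Simulating the interleaving from the list above:
        long x = _value;   // "thread A": load into register X
        long y = _value;   // "thread B": load into register Y
        x = x + 1;         // "thread A": increment (in X)
        y = y - 1;         // "thread B": decrement (in Y)
        _value = x;        // "thread A": store back (from X)
        _value = y;        // "thread B": store back (from Y) -- A's increment is lost
        Console.WriteLine(_value); // prints -1, not 0
    }
}
```

A `lock` (or `Interlocked.Add`) makes the load-modify-store sequence execute as one indivisible unit, so this schedule cannot occur.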

So, lock does not ensure that the caches are synchronized. Cache coherence is a processor-internal feature (implemented by protocols such as MESI) and does not depend on your code.

When one core writes a value to memory and the second core then tries to read it, the second core's cache entry for that line has been invalidated, so a cache miss occurs. That cache miss forces the entry to be reloaded with the current value.


The CLR memory model guarantees (requires) that loads/stores can't cross a fence. It's up to the CLR implementers to enforce this on real hardware, which they do. However, this is based on the advertised / understood behavior of the hardware, which can be wrong.


The lock keyword is just syntactic sugar for a pair of System.Threading.Monitor.Enter() and System.Threading.Monitor.Exit() calls. The implementations of Monitor.Enter() and Monitor.Exit() put up a memory fence, which entails architecture-appropriate cache flushing. So your other thread won't proceed until it can see the stores that result from the execution of the locked section.
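Concretely, since C# 4.0 the compiler expands a lock block into the Monitor.Enter overload that takes a ref bool, wrapped in a try/finally so the lock is released even if the body throws. A sketch of the expansion (the field names here are illustrative):

```csharp
using System;
using System.Threading;

class LockExpansion
{
    static long _value;
    static readonly object _syncRoot = new object();

    static void Increment()
    {
        // Roughly what the compiler emits for: lock (_syncRoot) { _value += 1; }
        bool lockTaken = false;
        try
        {
            Monitor.Enter(_syncRoot, ref lockTaken);
            _value += 1;   // Enter/Exit act as full fences around this body
        }
        finally
        {
            if (lockTaken) Monitor.Exit(_syncRoot);
        }
    }

    static void Main()
    {
        Increment();
        Console.WriteLine(_value); // prints 1
    }
}
```

The fence semantics of Monitor.Enter/Exit are what guarantee that stores made inside the locked section are visible to the next thread that acquires the same lock.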
