How do I atomically read a value in x86 ASM?

Developer (https://www.devze.com), 2023-01-09 08:24. Source: the web
I know how to atomically write a value in x86 ASM. But how do I read one? The LOCK prefix can't be used with mov.

To increase a value, I am doing:

lock inc dword ptr Counter

How do I read Counter in a thread-safe way?


As I explained in this post:

Accesses to cacheable memory that are split across bus widths, cache lines, and page boundaries are not guaranteed to be atomic by the Intel Core 2 Duo, Intel Core Duo, Pentium M, Pentium 4, Intel Xeon, P6 family, Pentium, and Intel486 processors. The Intel Core 2 Duo, Intel Core Duo, Pentium M, Pentium 4, Intel Xeon, and P6 family processors provide bus control signals that permit external memory subsystems to make split accesses atomic; however, nonaligned data accesses will seriously impact the performance of the processor and should be avoided.

So use:

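LOCK CMPXCHG [J], EAX

LOCK CMPXCHG acts as a full memory barrier and then compares EAX with the destination value. If they differ, the destination value is loaded into EAX; if they are equal, EAX is written back to the destination, which leaves it unchanged. Either way, EAX ends up holding the current value at [J]. The memory operand must be the destination: LOCK requires one, and CMPXCHG does not accept a memory source operand.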
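The same compare-and-swap read can be sketched in C11 rather than assembly. This is only a sketch, not the answerer's code: read_via_cas is a hypothetical helper name, and it assumes a compiler that ships <stdatomic.h>. On x86, compilers typically lower atomic_compare_exchange_strong to a LOCK CMPXCHG.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read *p with a compare-and-swap instead of a load.
 * The same value is offered as both "expected" and "desired":
 *   - if *p equals the guess, the same value is stored back (no change);
 *   - otherwise the CAS fails and copies the current *p into `expected`.
 * Either way `expected` holds the value that was in memory. */
uint32_t read_via_cas(_Atomic uint32_t *p)
{
    assert(p != NULL);
    uint32_t expected = 0;            /* arbitrary guess */
    const uint32_t desired = expected;
    atomic_compare_exchange_strong(p, &expected, desired);
    return expected;
}
```

Note that the CAS counts as a write to the location even when nothing changes, so a plain atomic load is usually the cheaper choice when the value is naturally aligned.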

EDIT: Links:

Intel® 64 and IA-32 Architectures Software Developer’s Manuals

In Volume 3A: System Programming Guide check section 8.1.1

Also check: Optimization Reference Manual section: CHAPTER 7 OPTIMIZING CACHE USAGE


I'm not an assembly expert, but word-sized (on x86, 32-bit) reads/writes should be atomic already.

The reason you need to lock the increment is because that's both a read AND a write.
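The difference between the two operations can be sketched in C11, assuming a compiler with <stdatomic.h>. The names counter_increment and counter_read are hypothetical. On x86, compilers typically emit a LOCK-prefixed instruction for the increment and an ordinary MOV for the load, but that is a property of the compiler and target, not something the C standard promises.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Read-modify-write: the read and the write must happen as one
 * indivisible step, so an atomic RMW operation is required. */
void counter_increment(_Atomic uint32_t *c)
{
    assert(c != NULL);
    atomic_fetch_add_explicit(c, 1, memory_order_relaxed);
}

/* Plain read: a single atomic load of a naturally aligned 32-bit value. */
uint32_t counter_read(_Atomic uint32_t *c)
{
    assert(c != NULL);
    return atomic_load_explicit(c, memory_order_acquire);
}
```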


For a simple read, it's mostly about alignment. The easiest way to ensure atomic reading is to always use "natural" alignment -- i.e., the alignment is at least as great as the size of the item (e.g., a 32-bit item is 32-bit aligned).

Misaligned reads aren't necessarily atomic. For an extreme example, consider reading a 32-bit value at an odd address where the first byte is in one cache line, and the other three bytes are in another cache line. In such a case, an atomic read is essentially impossible.
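The two conditions above (natural alignment, and an access split across cache lines) are simple address arithmetic. Here is a rough C sketch; the helper names are hypothetical, and the 64-byte line size is an assumption that holds for many x86 processors but is not guaranteed by the architecture.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines. Query the CPU if this matters. */
#define CACHE_LINE_SIZE 64u

/* True if addr is a multiple of size (size must be a power of two). */
bool is_naturally_aligned(uintptr_t addr, size_t size)
{
    assert(size != 0 && (size & (size - 1)) == 0);
    return (addr & (size - 1)) == 0;
}

/* True if the bytes [addr, addr + size) span more than one cache line. */
bool crosses_cache_line(uintptr_t addr, size_t size)
{
    assert(size != 0);
    return (addr / CACHE_LINE_SIZE) != ((addr + size - 1) / CACHE_LINE_SIZE);
}
```

For the extreme example in the text, a 4-byte read at address 0x103F has its first byte at the end of one line and the other three at the start of the next, so crosses_cache_line reports true.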

Since (at least most) processors use a 64-bit wide memory bus, the largest item that can hope to be read atomically with an ordinary load is 64 bits, and only when it is aligned on a 64-bit boundary.


It is interesting to read the other replies. I think @GJ is probably on the money.

For many years it was always true that 32-bit reads and writes were atomic. It is only in recent years, with the really aggressive caching, that this is no longer guaranteed for accesses that are not naturally aligned.

I guess that's why I prefer C++, Java or some such between me and the machine code. These days the machine code is too complex to write reliably (unless you do it a lot to keep your skills sharp). Luckily, today's optimising compilers are so good that you seldom need the performance of hand-optimised assembler.
