lock free containers and visibility


I've seen some lock-free implementations of a stack... My question is about visibility, not atomicity. For example, must the elements (not the pointers) of a lock-free stack be at most 64 bits? I think so, because otherwise you can't guarantee visibility. A real example: can this struct be safely inserted into and removed from a lock-free container?

struct person
{
   string name;
   uint32_t age;
};

EDIT: some people are confused by the question. To explain a bit: if a writer pushes a person onto the stack and a reader pops it, is it guaranteed that the reader sees (memory visibility) the correct contents of the person?


I may be wrong, but I think the question is incorrect.

Atomic instructions typically deal with data of a single pointer's length; at most, with two pointer lengths' worth of data.

A typical structure cannot be atomically manipulated because it is too large.

So the lock-free stack will only ever manipulate pointers to elements (which, AFAIK, need to be aligned on pointer-length boundaries - I know of no platform where this is not the case).


I must admit to being slightly confused by the question myself...

Lock-free data structures work by manipulating pointers (of machine size & alignment) to nodes. Those nodes then contain the real "content" of your lock-free data structure. That content can have any size you want, so your struct could be put there. The usual node for lock-free data structures looks something like:

struct Node { ATOMIC_PTR next; CONTENT content; };

where content can be whatever you want it to be: a pointer to memory containing something, or directly memory that contains something. Visibility is a non-issue here, since when modifying your lock-free data structure, you'd first allocate a new node, then populate it with the desired content, and finally use atomic operations to set the various involved pointers. Since that is the last thing you do, and those operations usually involve memory barriers themselves to guarantee visibility and ordering, you're fine.
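To make that concrete, here is a minimal sketch of the pattern, assuming C++11 std::atomic; it deliberately ignores memory reclamation and the ABA problem, so it illustrates only the publishing step, not a production-ready stack:

#include <atomic>
#include <cstdint>
#include <string>
#include <utility>

struct person
{
   std::string name;
   uint32_t age;
};

struct Node
{
   person value;
   Node*  next;
};

std::atomic<Node*> head{nullptr};

void push(person p)
{
   // Build the node completely before it becomes reachable.
   Node* n = new Node{std::move(p), nullptr};
   n->next = head.load(std::memory_order_relaxed);
   // The release CAS is the last step: a successful exchange publishes the
   // fully constructed node (and its person payload) to other threads.
   while (!head.compare_exchange_weak(n->next, n,
                                      std::memory_order_release,
                                      std::memory_order_relaxed))
   {
      // On failure, n->next has been reloaded with the current head; retry.
   }
}

Node* pop()
{
   // The acquire load/CAS pairs with the release CAS in push, so the popped
   // node's contents are visible to this thread.
   Node* n = head.load(std::memory_order_acquire);
   while (n && !head.compare_exchange_weak(n, n->next,
                                           std::memory_order_acquire,
                                           std::memory_order_acquire))
   {
      // On failure, n has been reloaded with the current head; retry.
   }
   return n; // caller takes ownership; nullptr means the stack was empty
}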


According to other Q&As on SO,

  • Does Interlocked.CompareExchange use a memory barrier?
  • question about lock free queues

every x86 interlocked instruction (including CAS) implies a full memory barrier. So, at least on x86, we don't need to care about visibility of the elements of lock-free containers (assuming that they use CAS).
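In portable C++ the same guarantee is simply what you get by default from std::atomic: compare-and-swap defaults to memory_order_seq_cst, which on x86 is emitted as a LOCK-prefixed instruction (a full barrier). A tiny sketch (the names are illustrative, not from any particular library):

#include <atomic>

struct Node; // the node type itself does not matter here, only the pointer

std::atomic<Node*> head{nullptr};

bool try_replace_head(Node* expected, Node* desired)
{
   // Default ordering is memory_order_seq_cst, so a successful CAS also
   // publishes every ordinary write made before it.
   return head.compare_exchange_strong(expected, desired);
}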


Yes, the structure could be used, because all you need for a lock-free data structure is some way of atomically updating a single value that represents the innards of the structure. The size of the element or payload has no impact on its lock-free nature.

As I understand it, a lock-free data structure would work like so:

  1. Copy the data
  2. Modify the data
  3. Atomically check that the original object has been unmodified and update it
  4. Otherwise repeat from the beginning

So as long as the third step can be performed atomically, all is well.
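A minimal sketch of that loop, assuming C++11 std::atomic and using a single 64-bit value to stand in for "the innards of the structure":

#include <atomic>
#include <cstdint>

std::atomic<uint64_t> shared_state{0};

void update()
{
   uint64_t observed = shared_state.load();   // 1. copy the data
   uint64_t desired;
   do
   {
      desired = observed * 2 + 1;             // 2. modify the private copy
      // 3. publish only if shared_state still equals `observed`;
      // 4. on failure, `observed` is reloaded and the loop repeats.
   } while (!shared_state.compare_exchange_weak(observed, desired));
}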

Making the elements themselves atomically updatable wouldn't give you any benefit since the container has to manage them collectively as a group.


Note: Please mark this answer as correct only if you actually test this approach.

About your question whether the struct below can be safely inserted into and removed from a lock-free container:

struct person
{
   string name;
   uint32_t age;
};

Multi-byte sequences of any length can be safely inserted into and removed from a lock-free container if you use a redundant encoding. Let us assume that we already have atomic instructions for manipulating 4 bytes at a time (32 bits). In that case, we can encode the uint32_t age field like so:

struct age_t
{
   uint32_t age_low;
   uint32_t age_high;
};

The field age_low stores the low bits (for example, the low 16 bits) of the 32-bit uint32_t age. The field age_high stores the remaining high bits. Conceptually:

struct age_t
{
   uint16_t age_low;
   uint16_t id_low;
   uint16_t age_high;
   uint16_t id_high;
};

The fields id_low and id_high should contain an ID identifying the writer.

A read is implemented as two atomic 32-bit reads, and succeeds if all the id_ parts are equivalent to each other. If it fails, the read operation needs to be restarted.

A write is implemented as two atomic 32-bit writes, and is followed by a read of the whole age_t value. The write succeeds if that read succeeds and the IDs which were read back are equal to the IDs that were written.

About the string value: the principle is the same. You just need to figure out how to split its binary value, similarly to how the age value was split. The same applies to reading/writing the whole person structure.
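A rough sketch of the encoding described above, assuming std::atomic&lt;uint32_t&gt; provides the 32-bit atomic accesses; the field layout and the 16-bit writer ID are illustrative assumptions, so treat this as a restatement of the idea rather than a vetted algorithm:

#include <atomic>
#include <cstdint>
#include <optional>

struct encoded_age
{
   // Each 32-bit word packs half of the age plus the writer's 16-bit ID.
   std::atomic<uint32_t> lo{0}; // bits 0-15: age_low,  bits 16-31: id
   std::atomic<uint32_t> hi{0}; // bits 0-15: age_high, bits 16-31: id
};

// Read: two atomic 32-bit loads; succeeds only if both IDs agree.
std::optional<uint32_t> read_age(const encoded_age& e)
{
   uint32_t lo = e.lo.load();
   uint32_t hi = e.hi.load();
   if ((lo >> 16) != (hi >> 16))
      return std::nullopt; // inconsistent halves, the caller must retry
   return (lo & 0xFFFFu) | ((hi & 0xFFFFu) << 16);
}

// Write: two atomic 32-bit stores tagged with this writer's ID, followed by a
// verification read; succeeds only if both IDs read back are still ours.
bool write_age(encoded_age& e, uint32_t age, uint16_t writer_id)
{
   e.lo.store((age & 0xFFFFu)         | (uint32_t{writer_id} << 16));
   e.hi.store(((age >> 16) & 0xFFFFu) | (uint32_t{writer_id} << 16));
   uint32_t lo = e.lo.load();
   uint32_t hi = e.hi.load();
   return (lo >> 16) == writer_id && (hi >> 16) == writer_id;
}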


Thread-safe containers (lock-free or with locks, for that matter) solve the thread safety of the list/container, not the thread safety of the items you put into the container. Thus the lock-free stack will make sure that push and pop are thread-safe and lock-free, but if two different threads hold pointers to the same struct instance (e.g. the thread that pushed it onto the stack still holds the pointer while another thread pops the stack), you'd have to use some other thread-safety measure to ensure the struct's consistency.


One could implement a thread-safe deque for arbitrary-sized data elements if the elements did not have to be stored within the stack or queue in any particular order, and if one used a thread-safe queue to hold the indices of unallocated slots, as well as a thread-safe deque to hold the indices of the data slots which hold the enqueued/stacked items. Writing an item would entail pulling a slot number from the "unallocated slots" queue, writing the desired data to that slot, and then enqueueing that slot number in the "main" deque. Fetching an item would require pulling a slot number from the "main" deque, copying the data elsewhere, and then enqueueing that number in the "unallocated slots" queue.
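A sketch of that slot-recycling scheme; the IndexQueue parameter stands for some existing thread-safe (ideally lock-free) queue of slot indices with push/pop operations, which is assumed rather than implemented here, and which is also expected to provide the release/acquire ordering that publishes the slot contents:

#include <array>
#include <cstddef>
#include <optional>

template <typename T, std::size_t N, typename IndexQueue>
class slot_recycling_queue
{
   std::array<T, N> slots_; // storage for the arbitrary-sized elements
   IndexQueue free_;        // indices of unallocated slots
   IndexQueue main_;        // indices of slots holding enqueued items

public:
   slot_recycling_queue()
   {
      for (std::size_t i = 0; i < N; ++i)
         free_.push(i); // initially every slot is unallocated
   }

   bool push(const T& value)
   {
      std::size_t slot;
      if (!free_.pop(slot))     // claim a free slot
         return false;          // no free slot available
      slots_[slot] = value;     // write the payload into the claimed slot
      main_.push(slot);         // publish the slot index to consumers
      return true;
   }

   std::optional<T> pop()
   {
      std::size_t slot;
      if (!main_.pop(slot))     // take a published slot index
         return std::nullopt;   // nothing enqueued
      T value = slots_[slot];   // copy the payload out
      free_.push(slot);         // recycle the slot for future writers
      return value;
   }
};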

One caveat with this approach is that, while it may be "lock-free" insofar as a thread which stalls cannot arbitrarily delay the progress of other threads, a thread which gets waylaid between the time it acquires a slot index from one queue and the time it stores it in the other queue may render an array slot unusable for an arbitrary length of time. By contrast, some wait-free stack or queue implementations used with smaller data types don't have that limitation: a thread could vanish from existence during a read or write, and the system would either be in a valid state indicating that the read or write completed, or else in a valid state indicating that it was never begun.
