开发者

How should I handle thread synchronisation in a shared tree data structure?

开发者 https://www.devze.com 2023-01-27 08:58 出处:网络
This is just for a kind of concurrency refresher... Imagine I have a B+ tree data structure in memory - multiple items per node, only leaf nodes contain items, leaf nodes also form a linked list for

This is just for a kind of concurrency refresher...

Imagine I have a B+ tree data structure in memory - multiple items per node, only leaf nodes contain items, leaf nodes also form a linked list for easy sequential access. Inserts and deletes mostly only affect a leaf node, but can cause nodes to split or merge in a process that may propagate to the root.

I have a single-thread implementation, and the updates follow a kind of pre-planning approach. A recursion steps up the tree from leaf level as far as nodes need to change, building a linked list (linking local variable in different recursions) that describes the changes needed. When it knows what's needed, it can check whether it can allocate all needed nodes, and apply all needed changes (or not) by referencing this plan before falling out 开发者_如何学Cof the recursion.

This implementation also "maintains" iterators on updates, so iterators aren't invalidated by inserts/deletes unless the specific item they point to is deleted. Inserts/deletes within the same node cause the iterators pointing into that node to be updated.

Trouble is, I need to make it multithreaded - supporting potentially many readers and writers at once.

I want multiple readers to be able to read and write at the same time, so long as there is no risk of corruption as a result. So for reading, I don't want mutually exclusive access at all, even to a single node. For writing, I want to lock the minimum number of nodes needed for the change. And I want to avoid deadlock, of course.

Thankfully, it isn't something I actually need to do - but since I've neglected my concurrency skills, this just seems like a good thought experiment.

This is obviously similar to the kinds of problems that databases and filesystems have to handle, so I'm guessing I might get some references to that kind of thing, which would be great.

So - how would I handle the thread synchronisation for this? I can vaguely see a role for mutexes and/or semaphores on nodes, but what strategies would I use to work with them?


Definitely challenging task! I see that you c++ programmer, however I believe that in c++ there are similar concepts as in java and I'll try to help from java standpoint.

So for reading, I don't want mutually exclusive access at all, even to a single node

You could use ReadWriteLock. It be held simultaneously by multiple reader threads, so long as there are no writers. The write lock is exclusive. You just have to use exclusive access when doing writing. Do you have analogue in c++?

And I want to avoid deadlock, of course.

Just lock multiple nodes in order of levels (eg from top to bottom). That will guarantee you protection from deadlocks(that would be smth similar to Lamport's Bakery Algorithm).

As for databases - they resolve deadlocks by killing one process :-).

One more strategy is to implement unblocking tree structure in the similar manner how Cliff Click implemented unblocking hash map(state machine with all cases covered): video

Cheers

0

精彩评论

暂无评论...
验证码 换一张
取 消