I have a multithreaded application, where each thread has a variable of integer type. These variables are incremented during execution of the program. At certain points in the code, a thread compares its counting variable with those of the other threads.
Since threads running on multiple cores may have their memory operations reordered, a thread might not read the expected counter values of the other threads. One way to solve this is to use atomic variables, such as std::atomic<> in C++11. However, performing a memory fence at each increment of the counters will significantly slow down the program.
What I want is for a memory fence to be issued only when a thread is about to read the other threads' counters, so that the counters of all threads are updated in memory at that point. How can this be done in C++? I am using Linux and g++.
The C++11 standard library includes support for fences in <atomic> with std::atomic_thread_fence.
Calling this invokes a full fence:
std::atomic_thread_fence(std::memory_order_seq_cst);
If you want to emit only an acquire or only a release fence, you can use std::memory_order_acquire or std::memory_order_release instead.
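As a rough sketch of how this could fit the question, the counters below are std::atomic<int> updated with relaxed operations, and a full fence is issued only when a thread is about to read everyone's counters. The names PaddedCounter, increment, sum_all and run_demo, the thread count and the 64-byte padding are assumptions made for illustration, not part of the question. The counters stay atomic because concurrent access to a plain int would be a data race in C++11. Note also that the fence orders the reading thread's own memory operations; it does not by itself flush other threads' pending stores, so a value read while writers are still running may be slightly stale.

```cpp
#include <array>
#include <atomic>
#include <thread>
#include <vector>

constexpr int kThreads = 4;          // hypothetical number of worker threads
constexpr int kIncrements = 100000;  // hypothetical work per thread

// One counter per thread, padded (64 bytes assumed) to avoid false sharing.
struct alignas(64) PaddedCounter {
    std::atomic<int> value{0};
};

std::array<PaddedCounter, kThreads> counters;

// Fast path: only the owning thread writes its counter, so a relaxed
// load followed by a relaxed store is enough (no locked read-modify-write).
void increment(int self) {
    int v = counters[self].value.load(std::memory_order_relaxed);
    counters[self].value.store(v + 1, std::memory_order_relaxed);
}

// Slow path: issue one full fence, then read every thread's counter.
int sum_all() {
    std::atomic_thread_fence(std::memory_order_seq_cst);
    int total = 0;
    for (auto& c : counters) {
        total += c.value.load(std::memory_order_relaxed);
    }
    return total;
}

// Runs the workers to completion and returns the sum of all counters.
int run_demo() {
    std::vector<std::thread> workers;
    for (int t = 0; t < kThreads; ++t) {
        workers.emplace_back([t] {
            for (int i = 0; i < kIncrements; ++i) increment(t);
        });
    }
    for (auto& w : workers) w.join();
    return sum_all();
}
```

With g++ on Linux this needs -std=c++11 -pthread. After all workers have been joined, run_demo() returns kThreads * kIncrements.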
There are x86 intrinsics that correspond to memory barriers, and you can call them directly. The Windows headers have a memory barrier macro (MemoryBarrier), so you should be able to find something equivalent on Linux.
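On Linux with g++, one possible equivalent is sketched below. It uses the SSE2 intrinsic _mm_mfence where available and falls back to the GCC builtin __sync_synchronize otherwise. The wrapper names compiler_barrier, full_barrier and barrier_demo are made up for this example, and the inline asm syntax is specific to GCC and Clang.

```cpp
#if defined(__SSE2__)
#include <emmintrin.h>  // _mm_mfence (x86 with SSE2)
#endif

// Prevents the compiler from reordering memory accesses across this point.
// Emits no CPU instruction.
inline void compiler_barrier() {
    asm volatile("" ::: "memory");
}

// Full hardware memory barrier.
inline void full_barrier() {
#if defined(__SSE2__)
    _mm_mfence();          // x86 MFENCE instruction
#else
    __sync_synchronize();  // GCC/Clang builtin full barrier
#endif
}

// Trivial single-threaded usage: write, fence, read back.
int barrier_demo() {
    volatile int data = 0;
    data = 42;
    compiler_barrier();
    full_barrier();
    return data;
}
```

In new code, std::atomic_thread_fence is usually the more portable choice. The intrinsics are mainly of interest for pre-C++11 toolchains.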
You can use boost::asio::strand for this exact purpose. Create a handler responsible for reading the counter. That handler can be called from multiple threads. Instead of calling the handler directly, wrap it in a boost::asio::strand. This ensures that the handler cannot be called concurrently by multiple threads.
http://www.boost.org/doc/libs/1_35_0/doc/html/boost_asio/tutorial/tuttimer5.html
I hope I understood the question right.
My suggestion would be to have a collectTimers() function in a higher-level class that asks each thread for its counter (via a queue or message). This way, updating the counters is not delayed, but collecting them is a bit slower.
This only works if you have some kind of communication mechanism between the threads.
Why not have a "control" thread, to which each thread reports its counter increments and from which it asks for the values of the others?
It would make it very efficient and simple. Just a suggestion.
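A minimal sketch of what such a control thread might look like, using only the standard library. The class CounterController, its methods report, snapshot and stop, and the function controller_demo are all hypothetical names chosen for this example.

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical controller: owns every counter; workers report to it via a queue.
class CounterController {
public:
    explicit CounterController(int n_threads)
        : counts_(n_threads, 0), worker_([this] { run(); }) {}

    ~CounterController() { stop(); }

    // Called by worker threads: counter `id` advanced by `delta`.
    void report(int id, int delta) {
        {
            std::lock_guard<std::mutex> lock(m_);
            inbox_.push({id, delta});
        }
        cv_.notify_one();
    }

    // Copy of all counters as processed by the control thread so far.
    std::vector<int> snapshot() {
        std::lock_guard<std::mutex> lock(m_);
        return counts_;
    }

    // Lets the control thread drain pending reports, then joins it.
    void stop() {
        {
            std::lock_guard<std::mutex> lock(m_);
            done_ = true;
        }
        cv_.notify_one();
        if (worker_.joinable()) worker_.join();
    }

private:
    // Control thread loop: apply queued reports until asked to stop.
    void run() {
        std::unique_lock<std::mutex> lock(m_);
        for (;;) {
            cv_.wait(lock, [this] { return done_ || !inbox_.empty(); });
            while (!inbox_.empty()) {
                counts_[inbox_.front().first] += inbox_.front().second;
                inbox_.pop();
            }
            if (done_) return;
        }
    }

    std::mutex m_;
    std::condition_variable cv_;
    std::queue<std::pair<int, int>> inbox_;
    std::vector<int> counts_;
    bool done_ = false;
    std::thread worker_;  // declared last so it starts after the other members exist
};

// Two workers report increments; returns the final counters.
std::vector<int> controller_demo() {
    CounterController ctl(2);
    std::thread a([&] { for (int i = 0; i < 1000; ++i) ctl.report(0, 1); });
    std::thread b([&] { for (int i = 0; i < 500; ++i) ctl.report(1, 1); });
    a.join();
    b.join();
    ctl.stop();
    return ctl.snapshot();
}
```

Every report() takes a lock in this sketch, so the increment path is not free. Reporting increments in batches would reduce that cost, at the price of staler values.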
You could try something like the signal-theft limit counter design in Section 4.4.3 of http://mirror.nexcess.net/kernel.org/linux/kernel/people/paulmck/perfbook/perfbook.2011.08.28a.pdf
This kind of design can eliminate the atomic operations from the fastpath (incrementing the per-thread counter). Whether the complexity is worth it is up to you to decide, of course.