
Is boost::thread_specific_ptr<>::get() expected to be slow? Any workarounds?

Developer https://www.devze.com 2023-02-18 07:43 Source: Internet

I'm currently profiling an application with performance problems using Valgrind's "Callgrind". In looking at the profiling data, it appears that a good 25% of processing time is being spent inside of boost::detail::get_tss_data in an application whose primary purpose is physics simulation and visualization.

get_tss_data is apparently called by thread_specific_ptr::get.

Does anyone see this as expected? Does it generally imply something else specific?

Edit:

My platform is: Linux-2.6.32, x86, GCC 4.4.3, libc6-2.11.1/libpthread-2.11.1


On POSIX systems, thread_specific_ptr is implemented on top of pthread_setspecific/pthread_getspecific, which is not the fastest way to reach thread-local data.

If you are on a POSIX system and compiling with GCC, you can use GCC's __thread storage-class specifier instead. However, it can only be used with initializers that are constant expressions.

On Windows, the similar specifier is __declspec(thread).
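
As a rough illustration of the __thread approach, here is a minimal sketch assuming GCC or Clang with C++11 enabled. The names tls_counter, bump_counter and bump_on_new_thread are made up for this example and are not part of Boost or of the original application. Each thread gets its own copy of the variable, and the initializer is a plain constant, as __thread requires.

```cpp
#include <cassert>
#include <thread>

// Hypothetical per-thread counter. __thread (a GCC extension) needs a
// simple type and a constant-expression initializer.
static __thread int tls_counter = 0;

// Increments the calling thread's private counter n times and returns
// the counter's new value.
int bump_counter(int n) {
    assert(n >= 0);
    for (int i = 0; i < n; ++i) {
        ++tls_counter;
    }
    return tls_counter;
}

// Runs bump_counter on a freshly started thread and returns that
// thread's result. The new thread starts from its own zero-initialized
// copy of tls_counter.
int bump_on_new_thread(int n) {
    int result = -1;
    std::thread worker([&result, n] { result = bump_counter(n); });
    worker.join();
    return result;
}
```

Calling bump_counter(3) on one thread and then bump_on_new_thread(5) should leave the first thread's counter unchanged, since the two threads never share the variable.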


Obtaining thread local data will most probably involve a system call. System calls jump to an interrupt vector as well as now having to read kernel memory. All this kills the cache.

For this reason, reading thread-local data can take much longer than a normal variable read. It may therefore be a good idea to cache thread-local data in a local variable and avoid making frequent accesses to thread-local storage.
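
The caching idea can be sketched as follows, using the POSIX calls directly rather than Boost so the example stays self-contained. SimState, get_state, accumulate_slow and accumulate_cached are hypothetical names invented for this sketch, and whether the hoisting pays off in a given application should be confirmed by profiling.

```cpp
#include <cassert>
#include <cstddef>
#include <pthread.h>

// Hypothetical per-thread simulation state.
struct SimState {
    double energy;
};

static pthread_key_t g_key;
static pthread_once_t g_key_once = PTHREAD_ONCE_INIT;

// Frees a thread's SimState when that thread exits.
static void destroy_state(void* p) { delete static_cast<SimState*>(p); }

// Creates the key exactly once for the whole process.
static void make_key() {
    int rc = pthread_key_create(&g_key, destroy_state);
    assert(rc == 0);
    (void)rc;
}

// Returns the calling thread's state, creating it on first use.
SimState* get_state() {
    pthread_once(&g_key_once, make_key);
    void* p = pthread_getspecific(g_key);
    if (p == NULL) {
        p = new SimState();  // value-initialized: energy == 0.0
        pthread_setspecific(g_key, p);
    }
    return static_cast<SimState*>(p);
}

// One thread-specific lookup per loop iteration.
double accumulate_slow(const double* values, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        get_state()->energy += values[i];
    }
    return get_state()->energy;
}

// A single lookup, hoisted out of the loop into a local pointer.
double accumulate_cached(const double* values, std::size_t n) {
    SimState* state = get_state();
    for (std::size_t i = 0; i < n; ++i) {
        state->energy += values[i];
    }
    return state->energy;
}
```

Both functions compute the same result. The cached version simply performs the thread-specific lookup once per call instead of once per element, and the cached pointer must only be used on the thread that obtained it.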
