I'm writing a highly parallel application that's multithreaded. I've already got an SSE-accelerated thread class written. If I were to write an MMX-accelerated thread class, then run both at the same time (one SSE thread and one MMX thread per core), would the performance improve noticeably?
I would think that this setup would help hide memory latency, but I'd like to be sure before I start pouring time into it.
The SSE and MMX instruction sets share the same set of vector processing execution units in the CPU. Therefore, running an SSE thread and an MMX thread will leave each thread with the same resources as running two SSE threads (or two MMX threads). The only difference is in the instructions that exist in SSE but not in MMX (since SSE is an extension of MMX). But in that case the MMX thread is probably going to be slower, because it doesn't have those more advanced instructions available to it.
So the answer is: No, you would not see a performance improvement compared to running two SSE threads.
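SSE and MMX run on the same execution units, so it doesn't matter which of the two you use (apart from MMX sucking and SSE being useful, of course). Note that they do not literally use the same registers: the MMX registers are aliased onto the x87 FPU registers, while SSE has its own XMM registers.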
The better question is how SSE is implemented on your target CPU. Does it have an SSE unit per core? (Probably.) If so, then you might as well run SSE instructions on every thread.
If it has an SSE unit shared between cores, then different threads will be fighting over it, so there won't be much gained by executing SSE instructions in multiple threads. (I don't know if any CPUs actually share the SSE unit between cores, though, so take this as a hypothetical case.)
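For what it's worth, here is a minimal sketch of the "run SSE on every thread" approach, not the asker's actual thread class. The names `add_range` and `parallel_add` are made up for illustration, the kernel (adding one float array into another) is just a placeholder workload, and one worker per hardware thread is an assumption rather than a tuned choice. The SSE path is only compiled when the compiler reports SSE support; otherwise a scalar loop is used.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define HAVE_SSE 1
#endif

// Hypothetical kernel: a[i] += b[i] for i in [begin, end).
void add_range(float* a, const float* b, std::size_t begin, std::size_t end) {
    std::size_t i = begin;
#ifdef HAVE_SSE
    // Process four floats per iteration using unaligned SSE loads and stores.
    for (; i + 4 <= end; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(a + i, _mm_add_ps(va, vb));
    }
#endif
    // Scalar tail (and the whole range when SSE is not available).
    for (; i < end; ++i) {
        a[i] += b[i];
    }
}

// Hypothetical driver: every worker runs the same SSE kernel on its own chunk.
void parallel_add(std::vector<float>& a, const std::vector<float>& b) {
    assert(b.size() >= a.size());

    // Assumption: one worker per hardware thread; fall back to 1 if unknown.
    unsigned n = std::thread::hardware_concurrency();
    if (n == 0) n = 1;

    const std::size_t len = a.size();
    const std::size_t chunk = (len + n - 1) / n;  // ceiling division

    std::vector<std::thread> workers;
    for (unsigned t = 0; t < n; ++t) {
        const std::size_t begin = std::min(len, t * chunk);
        const std::size_t end = std::min(len, begin + chunk);
        if (begin == end) break;  // no work left for further threads
        workers.emplace_back(add_range, a.data(), b.data(), begin, end);
    }
    for (auto& w : workers) {
        w.join();
    }
}
```

Calling `parallel_add(a, b)` on two equally sized vectors adds `b` into `a` in place. Since every thread uses the same instruction set, there is only one kernel to write and tune, which is the practical upside of skipping a separate MMX path.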