In parallel computing theoretically super-linear speedup is not possible. But in practice we do see such cases. One reason is cache effect but I fail to understand what does it play. Also, there are other things in开发者_运维知识库volved but what are they? In summary,
How are super-linear speedups possible?
I'm a beginner with respect to parallel computing.
Suppose you have an 8 processor machine, each processor has a 1MB cache, and your computation uses 6MB of data.
On 1 processor the computation will be doing a lot of data movement between CPU, cache and RAM. On 8 processors the computation will only have to move data between CPU and cache. This way you can achieve super-linear speedup.
These figures and this analysis have been simplified for exposition for a beginner.
In short, superlinear speedup is achieved when the total amount of work processors do is strictly less than the total work performed by a single processor.
This can happen in three ways:
The original sequential algorithm was really bad, using the parallel version of the algorithm on one processor will usually do away with the superlinear speedup.
The parallel algorithm uses some search like a random walk, the more processors that are walking, the less distance has to be walked in total before you reach what you are looking for.
Modern processors have faster and slower memories. Typically it will try to keep the data you are using in the fast memory. We can safely say your amount of data is larger than the amount of fast memory. If you use n processors you have n times the amount of faster memory. More data fits in the fast memory which makes it possible to take less time (thus amount of work) on the same task.
精彩评论