Are distributed systems a completely independent concept compared to symmetric multiprocessing (since in distributed systems we have individual memory/disk storage per CPU, whereas in symmetric multiprocessing many CPUs share the same memory/disk storage)?
I wouldn't say they are completely different concepts, because you can have shared memory in a distributed system (using distributed shared memory), and multiple processes running on the same machine don't share their address space. So both programming models can exist on both architectures, but at a cost. In general, shared memory is easier to program but harder to build (from the hardware point of view), while distributed systems are harder to program but easier to build.
So the real distinction, at least from the programming point of view, is between shared memory and non-shared memory.
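To make that distinction concrete, here is a minimal Go sketch (my own illustration, not tied to any particular system) that sums the same values both ways: goroutines stand in for CPUs or nodes, a mutex-guarded variable plays the role of shared memory, and a channel plays the role of message passing between disjoint address spaces.

```go
package main

import (
	"fmt"
	"sync"
)

// Shared-memory style: all workers update one counter behind a mutex.
// Easy to express, but correctness depends entirely on the lock.
func sharedMemorySum(values []int) int {
	var mu sync.Mutex
	var wg sync.WaitGroup
	total := 0
	for _, v := range values {
		wg.Add(1)
		go func(v int) {
			defer wg.Done()
			mu.Lock()
			total += v // every worker touches the same address
			mu.Unlock()
		}(v)
	}
	wg.Wait()
	return total
}

// Message-passing style: no shared address space; each worker sends
// its result over a channel, much as a distributed node would send a
// message over the network.
func messagePassingSum(values []int) int {
	results := make(chan int, len(values))
	for _, v := range values {
		go func(v int) { results <- v }(v)
	}
	total := 0
	for range values {
		total += <-results
	}
	return total
}

func main() {
	data := []int{1, 2, 3, 4, 5}
	fmt.Println(sharedMemorySum(data), messagePassingSum(data))
}
```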
Distributed Computing and SMP are not the same thing, although DC might use SMP. DC is a way to parallelize independent workloads across heterogeneous, loosely coupled systems.
An SMP system is a machine with tightly coupled CPUs and memory, benefiting from low-latency memory access and from sharing data among CPUs while the computations happen.
Example of distributed computing (a minimal sketch follows the list):
Einstein@Home is a project trying to find gravitational waves in experimental data gathered from huge laser interferometers. The data to be crunched is largely independent, so distributing it to several different machines is no problem.
- Storage: Shared storage not needed.
- Shared memory: Not needed, since the FFT routines used to find the desired results work on independent data chunks.
- Workload distribution: Done over a pool of heterogeneous machines.
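Here is a Go sketch of that pattern, assuming a hypothetical crunchChunk function as a stand-in for the real per-chunk analysis (Einstein@Home's actual pipeline is of course far more involved): because each chunk is independent, the work can be fanned out with no shared state at all.

```go
package main

import "fmt"

// crunchChunk is a hypothetical stand-in for the per-chunk analysis
// (e.g. an FFT over one slice of detector data). Each call depends
// only on its own input, so chunks can be sent to any machine.
func crunchChunk(chunk []float64) float64 {
	sum := 0.0
	for _, x := range chunk {
		sum += x * x // placeholder "result" for the chunk
	}
	return sum
}

func main() {
	chunks := [][]float64{
		{0.1, 0.2, 0.3},
		{0.4, 0.5},
		{0.6, 0.7, 0.8, 0.9},
	}

	// Fan the independent chunks out to workers; in a real DC project
	// these would be remote, heterogeneous volunteer machines.
	results := make(chan float64, len(chunks))
	for _, c := range chunks {
		go func(c []float64) { results <- crunchChunk(c) }(c)
	}
	for range chunks {
		fmt.Println(<-results)
	}
}
```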
Example of Symmetric Multiprocessing (again, a sketch follows the list):
Running computations on large tables/matrices requires a certain proximity between the computing nodes ("CPUs"/"DC nodes") for the computation to finish. If the result at one node depends on the result of a "neighboring" node, the Distributed Computing paradigm won't help you much.
- Storage: Should be shared and accessible as fast as possible.
- Shared memory: Needed to exchange interim results.
- Workload distribution: Takes place inside a for-loop construct; the programmer has to design the loops so that related computations happen at roughly the same time.
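As an illustration of why such neighbor-dependent loops want shared memory, here is a small Go sketch (goroutines again standing in for CPUs) of one parallel sweep over an array where each output cell reads its neighbors' input values:

```go
package main

import (
	"fmt"
	"sync"
)

func main() {
	in := []float64{1, 4, 9, 16, 25, 36}
	out := make([]float64, len(in))

	// One parallel sweep of a neighbor-dependent update: out[i] needs
	// in[i-1] and in[i+1], so every worker must see the whole shared
	// array. On an SMP machine this is a cheap memory read; across
	// distributed nodes it would be a network round trip per element.
	var wg sync.WaitGroup
	for i := 1; i < len(in)-1; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			out[i] = (in[i-1] + in[i] + in[i+1]) / 3
		}(i)
	}
	wg.Wait()
	fmt.Println(out)
}
```

On an SMP machine every one of those neighbor reads is a plain access to shared RAM; spread across distributed nodes, each read would become a network message, which is exactly the cost that rules DC out here.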
Hope that helps... Alex.