Running on multiple cores using MPI

I currently submit MPI jobs with a command of the form:

    mpirun -np <number of processes> <executable>
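For example, launching four processes (the executable name here is just a placeholder) looks like:

    mpirun -np 4 ./my_mpi_program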

My understanding is that with -np 4 the above command submits the job to 4 independent processors that communicate via MPI. However, on our setup each processor has 4 cores, and those cores go unutilized. My questions are the following:

  1. Is it possible, from the mpirun command line, to submit a job so that it runs on multiple cores of the same node, or across several nodes? If so, how?

  2. Does the above require any special setup within the code? I do understand from reading some literature that the communication time between cores can differ from the communication time between processors, so distributing the problem does require some thought... but beyond that issue, what else does one need to account for?

  3. Finally, is there a limit on how much data can be transferred? Is there a limit on how much data the bus can send/receive? Is there any limitation from the cache?

Thanks!


So question 1 is about launching processes, while questions 2 and 3 are, basically, about performance tuning. Performance tuning can involve substantial work on the underlying code, but you won't need to modify a line of code to do any of what follows.

What I understand from your first question is that you want to modify the distribution of the launched MPI processes. Doing this is necessarily outside the standard, because it's OS- and platform-dependent, so each MPI implementation has its own way to do it. Recent versions of OpenMPI and MPICH2 allow you to specify where the processes end up, so you can specify, for example, two processes per socket.
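As a sketch of what this looks like on the command line (the exact flag names vary between implementations and versions, so check mpirun --help or the man page on your system):

    # OpenMPI (1.8 and later): 8 processes, 2 per socket, each bound to a core
    mpirun -np 8 --map-by ppr:2:socket --bind-to core ./my_mpi_program

    # MPICH with the Hydra launcher: bind each process to a socket
    mpiexec -n 8 -bind-to socket ./my_mpi_program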

You do not need to modify the code for this to work, but there are performance implications depending on how the processes are distributed across cores. It's hard to say much about this in general, because it depends on your communication patterns, but yes, the "closer" two processes are, the faster their communication will be, by and large.
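One way to see this directly is to time a simple ping-pong between two ranks, once with both ranks placed on the same node and once across nodes. A minimal sketch (assuming a standard MPI installation; compile with mpicc and run with mpirun -np 2):

    /* pingpong.c: rough round-trip timing between ranks 0 and 1. */
    #include <mpi.h>
    #include <stdio.h>

    #define NREPS  1000
    #define NBYTES (1 << 20)            /* 1 MiB message */

    static char buf[NBYTES];

    int main(int argc, char **argv) {
        int rank;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        MPI_Barrier(MPI_COMM_WORLD);
        double t0 = MPI_Wtime();
        for (int i = 0; i < NREPS; i++) {
            if (rank == 0) {
                /* rank 0 sends, then waits for the echo */
                MPI_Send(buf, NBYTES, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
                MPI_Recv(buf, NBYTES, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
            } else if (rank == 1) {
                /* rank 1 echoes everything back */
                MPI_Recv(buf, NBYTES, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
                MPI_Send(buf, NBYTES, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
            }
        }
        double t1 = MPI_Wtime();

        if (rank == 0)
            printf("average round trip: %g microseconds\n",
                   1e6 * (t1 - t0) / NREPS);

        MPI_Finalize();
        return 0;
    }

Run it twice with different placement flags and compare the reported times; that gives you a concrete number for the intra-node vs. inter-node difference on your hardware.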

There's no specified limit on the total volume of data that goes back and forth between MPI tasks, but yes, there are bandwidth limits, and there are limits per message: in particular, the count argument of calls like MPI_Send is an int, so a single message is capped at 2^31 - 1 elements of its datatype. The cache size is whatever it is.
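If you ever need to move a buffer larger than that in one logical transfer, the usual trick is to split it into chunks (or use a larger derived datatype). A rough sketch; chunked_send and chunked_recv are made-up helper names, not part of the MPI API:

    /* chunked_transfer.c: split a large byte buffer into pieces that
     * fit the int count argument of MPI_Send/MPI_Recv. */
    #include <mpi.h>
    #include <limits.h>
    #include <stddef.h>
    #include <stdlib.h>

    static void chunked_send(const char *buf, size_t nbytes, int dest,
                             MPI_Comm comm) {
        size_t sent = 0;
        while (sent < nbytes) {
            size_t n = nbytes - sent;
            if (n > (size_t)INT_MAX) n = INT_MAX;  /* count is an int */
            MPI_Send(buf + sent, (int)n, MPI_CHAR, dest, 0, comm);
            sent += n;
        }
    }

    static void chunked_recv(char *buf, size_t nbytes, int src,
                             MPI_Comm comm) {
        size_t got = 0;
        while (got < nbytes) {
            size_t n = nbytes - got;
            if (n > (size_t)INT_MAX) n = INT_MAX;  /* mirror the sender */
            MPI_Recv(buf + got, (int)n, MPI_CHAR, src, 0, comm,
                     MPI_STATUS_IGNORE);
            got += n;
        }
    }

    int main(int argc, char **argv) {
        int rank;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        size_t nbytes = 64 * 1024 * 1024;  /* modest size for the demo */
        char *buf = malloc(nbytes);

        if (rank == 0)      chunked_send(buf, nbytes, 1, MPI_COMM_WORLD);
        else if (rank == 1) chunked_recv(buf, nbytes, 0, MPI_COMM_WORLD);

        free(buf);
        MPI_Finalize();
        return 0;
    }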
