OpenMP threads "disobey" omp barrier

Developer (开发者) https://www.devze.com 2023-02-06 18:42 Source: Internet

So here's the code:

#pragma omp parallel private (myId)
{
  set_affinity();

  myId = omp_get_thread_num(); 

  if (myId<myConstant)
  {
    #pragma omp for schedule(static,1)
    for(count = 0; count < AnotherConstant; count++)
      {
        //Do stuff, everything runs as it should
      }
  }

#pragma omp barrier //all threads wait as they should
#pragma omp single
 {
    //everything in here is executed by one thread as it should be
 }
   #pragma omp barrier //this is the barrier in which threads run ahead
   par_time(cc_time_tot, phi_time_tot, psi_time_tot);
   #pragma omp barrier
}
//do more stuff

Now to explain what's going on. At the start of my parallel region, myId is declared private so that every thread holds its own thread id. set_affinity() controls which thread runs on which core. The issue I have involves the #pragma omp for schedule(static,1).

The block:

  if (myId<myConstant)
  {
    #pragma omp for schedule(static,1)
    for(count = 0; count < AnotherConstant; count++)
      {
        //Do stuff, everything runs as it should
      }
  }

represents some work that I want to distribute over a certain number of threads, 0 through myConstant-1. On these threads I want to distribute the loop iterations evenly, in the manner that schedule(static,1) does. This is all performed correctly.

Then the code enters a single region, and all commands in there are performed as they should be. But say I specify myConstant to be 2. Then if I run with 3 threads or more, everything through the single region executes correctly, but threads with id 3 or greater do not wait for all the commands within the single to finish.

Within the single some functions are called that create tasks which are carried out by all threads. The threads with id of 3 or more (in general of myConstant or more) continue on, executing par_time() while the other threads are still carrying out tasks created by the functions executed in the single. par_time() just prints some data for each thread.

If I comment out the pragma omp for schedule(static,1) and just have a single thread execute the for loop (change if statement to if(myId==0) for instance), then everything works. So I'm just not sure why the aforementioned threads are continuing onwards.

Let me know if anything is confusing; it's kind of a specific issue. I was looking to see if anyone saw a flaw in my flow control with OMP.

Section 2.5 (Worksharing Constructs) of the OpenMP V3.0 spec states:

The following restrictions apply to worksharing constructs:

Each worksharing region must be encountered by all threads in a team or by none at all.

The sequence of worksharing regions and barrier regions encountered must be the same for every thread in a team.

By placing the worksharing for inside the if, you have violated both of these restrictions, making your program non-conforming. A non-conforming OpenMP program has "unspecified" behavior according to the specification.
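One way to keep the program conforming while still limiting the loop to the first few threads is to drop the worksharing construct and hand out the iterations by hand, so that the only constructs with barrier semantics are met by every thread. The following is a minimal sketch of that idea, not the asker's actual code: run_restricted, N_ITER, and the owner array are hypothetical names introduced for illustration, workers plays the role of myConstant, and the serial fallbacks exist only so the file still builds without OpenMP enabled.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks (assumption: only needed when built without OpenMP). */
static int omp_get_thread_num(void)  { return 0; }
static int omp_get_num_threads(void) { return 1; }
#endif

#define N_ITER 16   /* hypothetical stand-in for AnotherConstant */

/* Runs N_ITER iterations on at most `workers` threads of the team.
 * owner[i] receives the id of the thread that executed iteration i.
 * Returns how many iterations were already finished when the single
 * region ran, i.e. what the single region could safely rely on. */
int run_restricted(int workers, int owner[N_ITER])
{
    int finished = 0;
    assert(workers > 0);

    for (int i = 0; i < N_ITER; i++)
        owner[i] = -1;                      /* -1 means "not executed yet" */

    #pragma omp parallel
    {
        int myId     = omp_get_thread_num();
        int nthreads = omp_get_num_threads();
        /* Never count on more workers than the team actually has. */
        int active   = workers < nthreads ? workers : nthreads;

        if (myId < active) {
            /* Manual round-robin, equivalent in effect to schedule(static,1)
             * over `active` threads. No worksharing construct is used here,
             * so it is fine that only some threads enter this branch. */
            for (int count = myId; count < N_ITER; count += active)
                owner[count] = myId;
        }

        /* Every thread of the team reaches this barrier. */
        #pragma omp barrier

        /* Every thread also reaches the single; one of them executes it. */
        #pragma omp single
        {
            for (int i = 0; i < N_ITER; i++)
                if (owner[i] >= 0)
                    finished++;
        }
        /* Implicit barrier at the end of single: nobody runs ahead. */
    }
    return finished;
}
```

Calling run_restricted(2, owner) from a program built with OpenMP enabled (for example with -fopenmp on GCC) should return N_ITER, with every entry of owner being 0 or 1 regardless of how many threads are in the team.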

As to which threads will be used to execute the for loop, with the schedule type of "static,1", the first chunk of work - in this case count=0 - will be assigned to thread 0. The next chunk (count=1) will be assigned to thread 1, etc. until all chunks are assigned. If there are more chunks than threads then assignment will restart at thread 0 in a round-robin fashion. You can read the exact wording in the OpenMP spec, section 2.5.1 Loop construct, under description where it talks about the schedule clause.
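The round-robin assignment described above can be observed directly by recording which thread executes each iteration. This is a small sketch under assumed names (record_owners, owner, and team_size are hypothetical); here the worksharing for is reached by every thread in the team, so the program stays conforming.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks (assumption: only needed when built without OpenMP). */
static int omp_get_thread_num(void)  { return 0; }
static int omp_get_num_threads(void) { return 1; }
#endif

/* Stores in owner[i] the id of the thread that ran iteration i, and in
 * *team_size the number of threads in the team. */
void record_owners(int n, int *owner, int *team_size)
{
    #pragma omp parallel
    {
        /* One thread records the team size; all threads reach the single. */
        #pragma omp single
        *team_size = omp_get_num_threads();

        /* Chunks of one iteration are dealt out to threads 0, 1, 2, ...
         * and then wrap around to thread 0 again. */
        #pragma omp for schedule(static, 1)
        for (int i = 0; i < n; i++)
            owner[i] = omp_get_thread_num();
    }
}
```

After a call such as record_owners(12, owner, &team), each owner[i] should equal i % team, which is the round-robin pattern the schedule clause specifies for a static schedule with a chunk size of 1.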