I am trying to convert a process that loops through a ~12,000x12,000 cell matrix (around 125 times) to use parallel processing (via parallel_for). The code I am using is below. You can see where the for loop is commented out.
When I run this code with the for loop, there are no problems. When I run it (in debug) using parallel_for, it crashes at random points with "Unhandled exception at 0x00f3d4ae in FratarProcess.exe 0xC0000005: Access violation writing location 0x0000000."
Note: accessMatrix is declared as vector <vector <unsigned short> > accessMatrix; and is filled prior to this point.
void dumpMatrix(unsigned short m)
{
    int complete=0, start=2532, todo=accessMatrix.size()-start;
    vector <string> sqlStrings;
    Concurrency::parallel_for(start, (int)accessMatrix.size(),[&complete,&todo,&m,&sqlStrings](int i)
    //for(int i=start;i<accessMatrix.size();i++)
    {
        printf("Processing i=%i... completed %i/%i\n",i,complete,todo);
        for(unsigned short j=1;j<accessMatrix[i].size();j++)
        {
            if(accessMatrix[i][j]>0)
            {
                stringstream strSQL;
                strSQL << "INSERT INTO debug.dbf (I,J,M,V) VALUES(" << i << "," << j << "," << m << "," << accessMatrix[i][j] << ")";
                sqlStrings.push_back(strSQL.str());
            }
        }
        complete++;
    });
    ...
}
Can someone point me in the right direction so I can get this process using all 8 cores of my machine instead of one? Note that I am somewhat of a novice at C++. I am using Visual C++ Express.
You've used no synchronization protection for sqlStrings. It is not safe to mutate the container, print to output, or even increment a shared variable from multiple threads concurrently without using synchronization.
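To illustrate the kind of synchronization meant here, the following is a minimal sketch rather than a drop-in fix. It assumes portable C++11 threads (std::thread) instead of parallel_for, and the function name buildSqlLocked, its parameters, and the round-robin row split are hypothetical choices made for the example. The shared vector is guarded by a std::mutex and the shared counter is a std::atomic<int>.

```cpp
#include <atomic>
#include <cstddef>
#include <mutex>
#include <sstream>
#include <string>
#include <thread>
#include <vector>

// Hypothetical helper (not from the original post): builds one INSERT string
// per positive cell, spreading the rows over numThreads worker threads.
// numThreads is assumed to be at least 1.
std::vector<std::string> buildSqlLocked(
    const std::vector<std::vector<unsigned short>>& matrix,
    unsigned short m, unsigned numThreads, int& rowsDone)
{
    std::vector<std::string> sqlStrings;  // shared: only touched under sqlMutex
    std::mutex sqlMutex;
    std::atomic<int> complete(0);         // shared counter: atomic increments

    auto worker = [&](std::size_t first) {
        // Thread `first` handles rows first, first + numThreads, ...
        for (std::size_t i = first; i < matrix.size(); i += numThreads) {
            for (std::size_t j = 1; j < matrix[i].size(); ++j) {
                if (matrix[i][j] > 0) {
                    std::ostringstream sql;  // local to this thread, no lock needed
                    sql << "INSERT INTO debug.dbf (I,J,M,V) VALUES("
                        << i << "," << j << "," << m << "," << matrix[i][j] << ")";
                    // Lock only around the mutation of the shared container.
                    std::lock_guard<std::mutex> lock(sqlMutex);
                    sqlStrings.push_back(sql.str());
                }
            }
            ++complete;  // safe without a lock because the counter is atomic
        }
    };

    std::vector<std::thread> threads;
    for (unsigned t = 0; t < numThreads; ++t)
        threads.emplace_back(worker, t);
    for (auto& th : threads)
        th.join();

    rowsDone = complete.load();
    return sqlStrings;
}
```

The string is built outside the lock so that threads only serialize on the push_back itself. The order of the resulting strings depends on thread scheduling, so sort them afterwards if order matters.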
This will also solve the problem:
Declare a combinable object:
Concurrency::combinable<vector <string>> sqlStringsCombinable;
And in the loop:
sqlStringsCombinable.local().push_back(strSQL.str());
After the loop, combine them:
sqlStringsCombinable.combine_each([&sqlStrings](const std::vector<std::string>& vec)
{
    std::copy(vec.cbegin(), vec.cend(), back_inserter(sqlStrings));
});
This is also faster than manually synchronizing the loop, because each thread appends to its own local vector without taking a lock.
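The idea behind the thread-local approach can be sketched without the PPL. The following is an illustration under assumptions, not the library's implementation: it uses portable std::thread, gives each thread its own output vector (playing the role of local()), and merges them after all threads have finished (playing the role of combine_each). The name buildSqlPerThread and the round-robin row split are hypothetical.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <thread>
#include <vector>

// Hypothetical helper (not from the original post): each thread writes to a
// private vector, so the hot loop needs no lock. numThreads is assumed >= 1.
std::vector<std::string> buildSqlPerThread(
    const std::vector<std::vector<unsigned short>>& matrix,
    unsigned short m, unsigned numThreads)
{
    // One private output vector per thread, sized before any thread starts.
    std::vector<std::vector<std::string>> locals(numThreads);

    auto worker = [&](unsigned t) {
        std::vector<std::string>& mine = locals[t];  // only thread t touches this
        for (std::size_t i = t; i < matrix.size(); i += numThreads) {
            for (std::size_t j = 1; j < matrix[i].size(); ++j) {
                if (matrix[i][j] > 0) {
                    std::ostringstream sql;
                    sql << "INSERT INTO debug.dbf (I,J,M,V) VALUES("
                        << i << "," << j << "," << m << "," << matrix[i][j] << ")";
                    mine.push_back(sql.str());  // no synchronization required
                }
            }
        }
    };

    std::vector<std::thread> threads;
    for (unsigned t = 0; t < numThreads; ++t)
        threads.emplace_back(worker, t);
    for (auto& th : threads)
        th.join();

    // Single-threaded merge once every worker has finished.
    std::vector<std::string> sqlStrings;
    for (const auto& vec : locals)
        sqlStrings.insert(sqlStrings.end(), vec.begin(), vec.end());
    return sqlStrings;
}
```

Because the merge runs after join(), the only cross-thread coordination is thread start and finish, which is why this pattern tends to scale better than locking around every push_back.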
You can also use the concurrent_vector class.
- Add #include <concurrent_vector.h>.
- Then add using namespace concurrency;.
- Then you can declare the vector like so: concurrent_vector<pair<int,int>> xy_coords;
Using this vector is much like using std::vector, and its push_back is safe to call from multiple threads at once.