trouble calculating offset index into 3D array

devze.com (https://www.devze.com), 2023-02-07 23:55. Source: Internet
I am writing a CUDA kernel to create a 3x3 covariance matrix for each location in the rows*cols main matrix. The resulting 3D matrix is rows*cols*9 in size, which I allocated in a single malloc accordingly. I need to access it with a single index value.

The 9 values of the 3x3 covariance matrix are set according to the appropriate row r and column c from some other 2D arrays.

In other words, I need to calculate the appropriate index to access the 9 elements of the 3x3 covariance matrix, as well as the row and column offset of the 2D matrices that are inputs to the value, as well as the appropriate index for the storage array.

I have tried to simplify it down to the following:

   // Launched with 1D blocks of 512 threads (512 cols x 1 row); TILE_WIDTH = 512.
   int bx = blockIdx.x;
   int by = blockIdx.y;
   int tx = threadIdx.x;
   int ty = threadIdx.y;
   int r = by + ty;
   int c = bx*TILE_WIDTH + tx;
   int offset = r*cols+c;
   int ndx = r*cols*rows + c*cols;

   // This IF is meant to skip threads in a block that extends past the edge
   // of my original array (not sure if it is correct).
   if((r < rows) && (c < cols)){

      // otherArray just contains a value that I might do some operations on
      // to set each of the d_cov[ndx + 0] .. d_cov[ndx + 8] values.
      d_cov[ndx + 0] = otherArray[offset];
      d_cov[ndx + 1] = otherArray[offset];
      d_cov[ndx + 2] = otherArray[offset];
      d_cov[ndx + 3] = otherArray[offset];
      d_cov[ndx + 4] = otherArray[offset];
      d_cov[ndx + 5] = otherArray[offset];
      d_cov[ndx + 6] = otherArray[offset];
      d_cov[ndx + 7] = otherArray[offset];
      d_cov[ndx + 8] = otherArray[offset];
   }

When I compare this array against the values calculated on the CPU, which loops over i=rows, j=cols, k = 1..9, the results do not match up.

In other words, d_cov[i*rows*cols + j*cols + k] != correctAnswer[i][j][k]

Can anyone give me any tips on how to solve this problem? Is it an indexing problem, or some other logic error?


Rather than the answer (which I haven't stared hard enough to find), here's the technique I usually use for debugging these sorts of issues. First, set all values in your destination array to NaN. (You can do this via cudaMemset -- set every byte to 0xFF.) Then try uniformly setting every location to the value of the row, then inspect the results. In theory, it should look something like:

0 0 0 ... 0
1 1 1 ... 1
. . . .   .
. . .  .  .
. . .   . .
n n n ... n

If you see NaNs, you've failed to write to an element; if you see row elements out of place, something is wrong, and they'll usually be out of place in a suggestive pattern. Do something similar with the column value, and with the plane. Usually, this trick helps me find which part of the index calculation is awry, which is most of the battle. Hope that helps.
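A minimal CPU-side sketch of that technique, in plain C so it can be run without a GPU. The helper names (poison_with_nan, count_unwritten, write_row_ids) are made up for illustration, and write_row_ids is only a stand-in for the kernel being debugged; it assumes a [rows][cols][9] layout, which the original post does not confirm.

```c
#include <math.h>
#include <stddef.h>
#include <string.h>

/* Fill every byte with 0xFF. For IEEE-754 floats the bit pattern 0xFFFFFFFF
   is a NaN. On the device, cudaMemset(d_cov, 0xFF, n * sizeof(float)) would
   play the same role. */
void poison_with_nan(float *buf, size_t n)
{
    memset(buf, 0xFF, n * sizeof(float));
}

/* Count elements that are still NaN, i.e. were never written. */
size_t count_unwritten(const float *buf, size_t n)
{
    size_t missing = 0;
    for (size_t k = 0; k < n; ++k) {
        if (isnan(buf[k])) {
            ++missing;
        }
    }
    return missing;
}

/* Stand-in for the kernel under test: store the row number in all 9 slots
   of every (r, c) location, using the index formula being checked.
   ASSUMPTION: layout is [rows][cols][9], so the formula is (r*cols + c)*9 + k. */
void write_row_ids(float *cov, size_t rows, size_t cols)
{
    for (size_t r = 0; r < rows; ++r) {
        for (size_t c = 0; c < cols; ++c) {
            for (size_t k = 0; k < 9; ++k) {
                cov[(r * cols + c) * 9 + k] = (float)r;
            }
        }
    }
}
```

Call poison_with_nan on the buffer, run the writer, then check that count_unwritten returns 0 and that the row numbers appear where you expect them. Swapping a different index formula into write_row_ids shows how the pattern changes when the formula is wrong.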


I might be just stupid, but what is the logic in this line?

int ndx = r*cols*rows + c*cols;

Shouldn't you have

int ndx = offset*9;

If the size of your covariance array is rows*cols*9, then wouldn't offset*9 take you to the same location in the 3D covariance array as where you are in your input array? Then offset*9+0 would be location (0,0) of the 3x3 covariance matrix of the element at offset, offset*9+1 would be (0,1), offset*9+2 would be (0,2), offset*9+3 would be (1,0), and so on up to offset*9+8, which would be (2,2).
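The mapping described above can be sketched as a small index helper in plain C. The function names cov_index and question_index are hypothetical, and the sketch assumes the buffer is laid out as [rows][cols][9] with each 3x3 block stored row-major; the question's formula is included only for comparison.

```c
#include <assert.h>
#include <stddef.h>

/* Flat index of element (i, j) of the 3x3 block belonging to location (r, c).
   ASSUMPTION: buffer layout is [rows][cols][9], each 3x3 block row-major. */
size_t cov_index(size_t r, size_t c, size_t cols, size_t i, size_t j)
{
    assert(i < 3 && j < 3);
    size_t offset = r * cols + c;   /* same offset as for the 2D inputs */
    return offset * 9 + i * 3 + j;  /* 9 values per location, 3 per block row */
}

/* The index expression from the question, for comparison only. */
size_t question_index(size_t r, size_t c, size_t rows, size_t cols, size_t k)
{
    return r * cols * rows + c * cols + k;
}
```

With rows = 3 and cols = 4, cov_index covers 0 to 107 exactly once, while question_index maps both (r=0, c=1, k=0) and (r=0, c=0, k=4) to index 4, so neighbouring locations overwrite each other. Once rows reaches 10 or more, the question's formula also runs past the end of a rows*cols*9 buffer for the last row.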
