How can I create global variables in CUDA?
__device__ float *devD;
cudaMalloc((void**)&devD, s);
calculateDT_T2B<<<dimGrid, dimBlock>>>();
cudaMemcpy(dtr, devD, s, cudaMemcpyDeviceToHost);
print(dtr);
It does not give the correct answer (it prints what look like random numbers). But when I call
calculateDT_T2B<<<dimGrid, dimBlock>>>(devD); instead of
calculateDT_T2B<<<dimGrid, dimBlock>>>();
it gives the correct answer. Why?
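You cannot use cudaMalloc directly to allocate onto a __device__ symbol in GPU memory. When host code passes &devD to cudaMalloc, the returned address is stored only in host memory; the copy of devD that the kernel reads on the GPU is never set, which is why you get random numbers. Passing devD as a kernel argument works because the pointer value held on the host is then handed to the kernel explicitly. See my answer to your own, almost identical question, which you posted within a minute of this one. The short version is to use cudaMemcpyToSymbol to copy a dynamically allocated device pointer onto the statically declared symbol.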
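To make the short version concrete, here is a minimal sketch of that pattern. The kernel fillKernel, the helper runDemo, the buffer name buf, and the sizes are hypothetical stand-ins rather than the code from the question; only devD and the cudaMalloc / cudaMemcpyToSymbol / cudaMemcpy calls correspond to what is discussed above.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Statically declared device symbol that will hold a pointer to global memory.
__device__ float *devD;

// Hypothetical stand-in for calculateDT_T2B: takes no pointer argument and
// writes through the global symbol instead.
__global__ void fillKernel(int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) devD[i] = 2.0f * i;
}

// Hypothetical helper: returns 0 on success, nonzero on a CUDA error.
int runDemo(float *host, int n)
{
    size_t bytes = n * sizeof(float);
    float *buf = nullptr;  // ordinary host variable holding a device address

    // Allocate into a normal host-side pointer, not into &devD.
    if (cudaMalloc((void **)&buf, bytes) != cudaSuccess) return 1;

    // Copy the pointer value itself (not the data) onto the device symbol.
    if (cudaMemcpyToSymbol(devD, &buf, sizeof(buf)) != cudaSuccess) {
        cudaFree(buf);
        return 2;
    }

    fillKernel<<<(n + 127) / 128, 128>>>(n);

    // cudaMemcpy waits for the kernel to finish before copying back.
    cudaError_t err = cudaMemcpy(host, buf, bytes, cudaMemcpyDeviceToHost);
    cudaFree(buf);
    return err == cudaSuccess ? 0 : 3;
}

int main()
{
    const int n = 8;
    float host[n];
    int rc = runDemo(host, n);
    if (rc != 0) {
        fprintf(stderr, "CUDA call failed at step %d\n", rc);
        return 1;
    }
    for (int i = 0; i < n; ++i) printf("%f\n", host[i]);
    return 0;
}
```

Note that the copy back to the host uses buf, the pointer returned by cudaMalloc, rather than reading devD from host code.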