Keeping unused variables in CUDA

开发者 https://www.devze.com 2023-03-24 05:14 Source: the web

I made some kernels for testing bandwidth and they do no useful computations. A minimal example is

__global__ void testKernel(float* a) 
{
    unsigned int i = blockIdx.x*blockDim.x + threadIdx.x;
    float x;
    x = a[i];
}

When I compile, I get (not surprisingly)

warning: variable "x" was set but never used

and the kernel runs as quickly as an empty kernel:

__global__ void donothing() 
{
}

This indicates that the read of a[i] has been optimized out.

I have tried tricks such as

volatile float x;

if(x);

(void)(x);

and they suppress the warning, but the kernel still finishes too quickly.

How can I make sure that the useless instructions actually get executed?

I found the option CU_JIT_OPTIMIZATION_LEVEL, but Google mostly returns links to the documentation rather than examples of how to use it. Would this option help me, and how do I use it?

Try introducing a branch which stores the variable:

__global__ void testKernel(float* a, float *b) 
{
    unsigned int i = blockIdx.x*blockDim.x + threadIdx.x;
    float x;
    x = a[i];

    if(b)
    {
      *b = x;
    }
}

The cost of the branch compared to the cost of memory transfer is negligible.

At the kernel launch site, simply pass a null pointer:

testKernel<<<...>>>(a, static_cast<float*>(0));

Because b is a kernel argument whose value is only known at launch time, nvcc cannot fold the branch away when it compiles the kernel. It therefore cannot prove that the load is useless, so the load should not be removed.
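One way to check that the load really survives is to time the launch and convert the elapsed time into an effective read bandwidth. The following is a minimal sketch and not part of the original answer: the grid and block sizes, the buffer size, the warm-up launch, and the use of CUDA events for timing are all assumptions chosen for illustration.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Kernel from the answer: the store is guarded by a pointer that is
// only known at run time, so the load of a[i] cannot be discarded.
__global__ void testKernel(float* a, float* b)
{
    unsigned int i = blockIdx.x * blockDim.x + threadIdx.x;
    float x = a[i];

    if (b)
    {
        *b = x;
    }
}

int main()
{
    // Assumed launch configuration: 65536 blocks of 256 threads,
    // one float per thread (64 MiB read in total).
    const int threads = 256;
    const int blocks = 1 << 16;
    const size_t bytes = (size_t)threads * blocks * sizeof(float);

    float* a = nullptr;
    if (cudaMalloc(&a, bytes) != cudaSuccess)
    {
        printf("cudaMalloc failed\n");
        return 1;
    }
    cudaMemset(a, 0, bytes);

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    // Warm-up launch so one-time start-up costs are not timed.
    testKernel<<<blocks, threads>>>(a, nullptr);
    cudaDeviceSynchronize();

    // Timed launch: b is a null pointer, so nothing is ever stored.
    cudaEventRecord(start);
    testKernel<<<blocks, threads>>>(a, nullptr);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);

    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess)
    {
        printf("kernel failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);

    // bytes / (ms * 1e-3 s) / 1e9 = bytes / (ms * 1e6) GB/s
    printf("read %zu bytes in %.3f ms (%.2f GB/s)\n",
           bytes, ms, bytes / (ms * 1.0e6));

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(a);
    return 0;
}
```

If the reported time is close to that of the empty donothing kernel, the load has probably still been eliminated. A time that grows with the buffer size suggests the reads are actually being issued.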
