Should I create multiple OpenCL kernels to avoid conditional statements?

开发者 (Developer) https://www.devze.com 2023-01-24 05:10 Source: the web

In OpenCL, I have a kernel that needs to operate on both complex and real data. I could add a conditional statement that selects the right line of code to handle this, or I could have two kernels that I c

This obviously is bad for maintainability, but is it significant for performance?

If it's just one conditional statement, in my experience the performance difference is negligible, at least on NVIDIA hardware.

Basically, as long as all (or most) work-items follow the same code path, you're fine. As the code path taken depends on a kernel argument in your case, all work-items follow the same path.
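To make the "same code path" point concrete, here is a minimal sketch, not the asker's actual kernel. The names square_magnitude, is_complex, run_square_magnitude and current_id are all made up for illustration, and the interleaved re/im layout for complex data is an assumption. Because is_complex is a kernel argument, every work-item sees the same value and takes the same branch. The shim at the top only exists so the same source can be checked on the CPU as plain C; a real OpenCL compiler defines __OPENCL_VERSION__ and skips it.

```c
/* Shim: lets this sketch compile as ordinary C for a quick CPU check.
   An OpenCL compiler defines __OPENCL_VERSION__, so it never sees this part. */
#ifndef __OPENCL_VERSION__
#include <stddef.h>
#define __kernel
#define __global
static size_t current_id;  /* emulated work-item index (hypothetical) */
static size_t get_global_id(unsigned dim) { (void)dim; return current_id; }
#endif

/* Hypothetical kernel: squared magnitude of each element.
   Real input:    one float per element.
   Complex input: interleaved (re, im) pairs, an assumed layout. */
__kernel void square_magnitude(__global const float *in,
                               __global float *out,
                               int is_complex)
{
    size_t i = get_global_id(0);

    /* is_complex comes from a kernel argument, so it has the same value
       for every work-item: the branch is uniform and does not diverge. */
    if (is_complex) {
        float re = in[2 * i];
        float im = in[2 * i + 1];
        out[i] = re * re + im * im;
    } else {
        out[i] = in[i] * in[i];
    }
}

#ifndef __OPENCL_VERSION__
/* CPU stand-in for enqueueing n work-items (hypothetical helper). */
static void run_square_magnitude(const float *in, float *out,
                                 size_t n, int is_complex)
{
    for (current_id = 0; current_id < n; ++current_id)
        square_magnitude(in, out, is_complex);
}
#endif
```

A branch on something like get_global_id(0) % 2 would be a different story, since neighbouring work-items would then take different paths.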

It depends slightly on where the conditional is. Code for readability first, then optimize for performance only after you have measured it AND found that it's a problem.

E.g. separate kernel_for_RGB_image and kernel_for_ABGR_image kernels seem like a reasonable use; separate kernels that exist only to effectively unroll some deep inner loop might be a bigger maintenance headache.

I think the best way is to actually try and benchmark the two variants. In some cases having multiple conditional blocks compiled, even if only one of them is executed, can lead to worse performance. The reason is GPRs (general purpose registers): the compiler allocates as many registers as are needed for the worst case, and a kernel that uses more registers can leave room for fewer work-items running at the same time.

One solution I can suggest: keep a single kernel function, but use a compile-time conditional:

__kernel void work()
{
#if VAR
    // one code
#else
    // another code
#endif
}

You then need to recompile the kernel with VAR set to true or false whenever the condition changes. For the compiler this is obviously no different from two kernels, but it may be easier to maintain if part of the code is shared between them.
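On the host side, setting VAR usually means passing a -D option when the program is built. The following is only a sketch under that assumption: make_build_options is a hypothetical helper, and the clBuildProgram call is shown in a comment because it needs an OpenCL runtime plus program and device handles that are not part of this example.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: formats the build-options string that defines VAR
   for the kernel's "#if VAR". Returns 0 on success, -1 if buf is too small
   or formatting fails. */
static int make_build_options(char *buf, size_t size, int use_first_path)
{
    int n = snprintf(buf, size, "-D VAR=%d", use_first_path ? 1 : 0);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* With a real OpenCL runtime, the string would go in the "options"
   argument of clBuildProgram, for example:

       char opts[32];
       if (make_build_options(opts, sizeof opts, 1) == 0)
           clBuildProgram(program, 1, &device, opts, NULL, NULL);

   Here program and device are assumed to have been created earlier
   by the host code. */
```

Since each value of VAR needs its own build, it probably makes sense to build both variants once at start-up and pick the matching one per call, rather than rebuilding every time the condition changes.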