OpenCL Alternative Modulo Uses, Advice

开发者 https://www.devze.com 2023-02-07 18:00 Source: Internet

There is a simple function which I have used in C++ in the past to simulate simple forms of tessellation. The function takes a number n and a divisor d. The divisor must be a power of two minus one, and n should be between 0 and the divisor. It returns the result of n % (d+1) using a bitwise &.

Fairly sure the function goes like:

unsigned int BitwiseMod(unsigned int n, unsigned int d){ return n & d; }
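As a sanity check on the identity described above, here is a minimal sketch in plain C that compares the mask against the % operator. The helper names (is_low_bit_mask, bitwise_mod_checked, mask_matches_modulo) are made up for illustration and are not part of the original function; the sketch also assumes d is smaller than UINT_MAX so that d + 1 does not wrap to zero.

```c
#include <assert.h>

/* Returns 1 if d has the form 2^k - 1 (0, 1, 3, 7, 15, ...). */
static int is_low_bit_mask(unsigned int d)
{
    return (d & (d + 1u)) == 0u;
}

/* Same operation as BitwiseMod, with the precondition on d checked. */
static unsigned int bitwise_mod_checked(unsigned int n, unsigned int d)
{
    assert(is_low_bit_mask(d));
    return n & d;
}

/* Hypothetical helper: compares n & d with n % (d + 1) for every n below
   limit. Returns 1 if they always agree, 0 otherwise.
   Assumes d + 1 does not wrap to zero. */
static int mask_matches_modulo(unsigned int d, unsigned int limit)
{
    assert(d + 1u != 0u);
    for (unsigned int n = 0; n < limit; ++n) {
        if (bitwise_mod_checked(n, d) != n % (d + 1u))
            return 0;
    }
    return 1;
}
```

For example, bitwise_mod_checked(13u, 7u) gives 5, the same as 13 % 8. A divisor such as 6 is not of the form 2^k - 1, so the mask would not behave like a modulus there.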

I want to use this effectively in OpenCL and am wondering whether it will work as I imagine it to. In my mind, modulus is a very expensive operation on the GPU, but I am familiar with using it to form magnitude spaces and other techniques to travel through data.

More often, I would simply write it inline like this, assuming function calls have some overhead:

x[i] = 8*(i&d)+offset[i];  //OR in other contexts,...

num = (i&d)+offset[i];  // parentheses needed: + binds tighter than &
x[num] = data;
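One thing worth checking in expressions that mix & and +, like the ones above, is operator precedence. In C, C++ and OpenCL C, + binds more tightly than &, so i & d + offset is parsed as i & (d + offset). The sketch below spells out both readings; the function names are hypothetical and only serve the comparison.

```c
#include <assert.h>

/* What the compiler sees for `i & d + offset`: the addition happens first. */
static unsigned int index_as_parsed(unsigned int i, unsigned int d,
                                    unsigned int offset)
{
    return i & (d + offset);
}

/* Wrap i with the mask first, then add the offset. */
static unsigned int index_wrapped_then_offset(unsigned int i, unsigned int d,
                                              unsigned int offset)
{
    return (i & d) + offset;
}
```

With i = 13, d = 7 and offset = 1, the first form evaluates 13 & 8 and the second evaluates (13 & 7) + 1, so the two can give different indices.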

The question is: will this be useful or will it get in the way? If it is useful, can you give me some examples where I might try to apply it?


On NVIDIA's architectures, GT200 and up, modulo isn't particularly slow; it is no slower than a normal integer divide. See this paper for details.

However, using a bitwise AND is still quite a lot faster. As function calls are expensive on GPUs, OpenCL compilers aggressively use inlining to improve performance by default. You should be fine with a function call, as it will be inlined.
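As for examples of where the mask might be applied, here is a rough sketch in plain C of two common uses: splitting a linear index into a column and row when the row width is a power of two, and wrapping an index into a ring buffer. The tile width of 8 and all names (TILE_W, tile_col, tile_row, ring_write) are assumptions made for illustration. The same & and >> expressions are valid in OpenCL C, so the index math could be applied to a work-item index inside a kernel.

```c
#include <assert.h>

/* Hypothetical tile width; it must be a power of two for the mask to work. */
#define TILE_W     8u
#define TILE_MASK  (TILE_W - 1u)  /* 7 */
#define TILE_SHIFT 3u             /* log2(TILE_W) */

/* Column of linear index i in rows of TILE_W elements: same as i % TILE_W. */
static unsigned int tile_col(unsigned int i)
{
    return i & TILE_MASK;
}

/* Row of linear index i: same as i / TILE_W. */
static unsigned int tile_row(unsigned int i)
{
    return i >> TILE_SHIFT;
}

/* Writes value into a ring buffer of TILE_W slots, wrapping the index. */
static void ring_write(float *ring, unsigned int i, float value)
{
    ring[i & TILE_MASK] = value;
}
```

For instance, linear index 19 maps to column 3 of row 2, and writing at index 10 into the ring lands in slot 2. If the width is not a power of two, the mask no longer matches the modulus and % is needed instead.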
