Using Macros to Define Constants for CUDA

Developer https://www.devze.com 2022-12-22 08:49 Source: web

I'm trying to reduce the number of instructions and constant memory reads for a CUDA kernel.

As a result, I have realised that I can pull out the tile sizes from constant memory and turn them into macros. How do I define macros that evaluate to constants during preprocessing so that I can simply adjust three values and reduce the number of instructions performed in each kernel?

Here's an example:

#define TX 8
#define TY 6
#define TZ 4

#define TX2 (TX * 2)
#define TY2 (TY * 2)

#define OVER_TX (1.0f / float(TX))

Maybe this is already the case (or possibly handled by the nvcc compiler), but clearly I want the second block of macros to be evaluated by the preprocessor rather than replaced in the code so that it is not performed in every kernel. Any suggestions?


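The preprocessor only performs textual substitution, so TX2 simply becomes (8 * 2) in the source; it does not do the arithmetic itself. Modern compilers, nvcc included, will then typically evaluate constant expressions such as this at compile time wherever possible, so you should be OK. This is also true for properly defined constants (i.e. using const rather than the "old skool" #define method).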
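One way to convince yourself of this is to put the macros inside static_assert, which only compiles if the compiler can evaluate the expression itself. The following is a minimal sketch rather than a definitive recipe: it is plain host-side C++11 that nvcc should also accept, and the names kTX, kTX2, kOverTX, tile_elements and normalise_x are hypothetical, introduced only for illustration.

```cpp
// Macros from the question: the preprocessor substitutes the text,
// and the compiler folds the resulting constant expressions.
#define TX 8
#define TY 6
#define TZ 4

#define TX2 (TX * 2)
#define TY2 (TY * 2)

#define OVER_TX (1.0f / float(TX))

// static_assert needs a compile-time constant, so these lines would
// fail to build if the expressions were left for run time.
static_assert(TX2 == 16, "TX2 is evaluated at compile time");
static_assert(TY2 == 12, "TY2 is evaluated at compile time");
static_assert(OVER_TX == 0.125f, "OVER_TX is evaluated at compile time");

// Typed alternative to #define (hypothetical names, not from the question).
constexpr int kTX = 8;
constexpr int kTX2 = kTX * 2;
constexpr float kOverTX = 1.0f / float(kTX);
static_assert(kTX2 == TX2, "constexpr and macro versions agree");
static_assert(kOverTX == OVER_TX, "constexpr and macro versions agree");

// Hypothetical helpers showing the constants in ordinary code.
inline int tile_elements() { return TX * TY * TZ; }
inline float normalise_x(int x) { return x * OVER_TX; }
```

For device code, inspecting the PTX that nvcc emits (for example with nvcc -ptx) should show the folded literal rather than a multiplication or division on these constants.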
