Error compiling template function in CUDA using nvcc

Developer (https://www.devze.com) 2023-04-05 14:32 Source: web

I have the following CUDA code:

enum METHOD_E {
    METH_0 = 0,
    METH_1
};

template <enum METHOD_E METH>
inline __device__ int test_func<METH>()
{
    return int(METH);
}

__global__ void test_kernel()
{
    test_func<METH_0>();
}

void test()
{
    test_kernel<<<1, 1>>>();
}

When I compile I get the following error:

>nvcc --cuda test.cu
test.cu
test.cu(7): error: test_func is not a template

test.cu(14): error: identifier "test_func" is undefined

test.cu(14): error: expected an expression

3 errors detected in the compilation of "C:/Users/BLAH45~1/AppData/Local/Temp/tmpxft_00000b60_00000000-6_test.cpp1.ii".

Section D.1.4 of the Programming Guide (4.0, the version of the toolkit I'm using) suggests templates should work, but I can't get them to.

Can anyone suggest a change to this code which makes it compile (without removing the templating!)?


Your test_func definition is wrong:

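test_func<METH>() should be simply test_func(). A template argument list after the function name is only valid on an explicit specialization, not on the definition of the primary template, which is why nvcc reports "test_func is not a template".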

This works for me:

enum METHOD_E {
    METH_0 = 0,
    METH_1
};

template < enum METHOD_E METH>
__device__
inline
int test_func ()
{
    return int(METH);
}

__global__ void test_kernel()
{
    test_func<METH_0>();
}

void test()
{
    test_kernel<<<1, 1>>>();
}
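
To make the rule easier to check without a GPU, here is a minimal sketch of the same primary template marked __host__ __device__, so it can be called from ordinary host code as well. The macro guard and the helper sum_of_methods() are my own additions for illustration, not part of the original code; the guard only lets the file build with a plain C++ compiler when nvcc is not used.

```cpp
#include <cassert>

// Assumption: allow building with a plain C++ compiler as well as nvcc
// by turning the CUDA qualifiers into no-ops outside of nvcc.
#ifndef __CUDACC__
#define __host__
#define __device__
#endif

enum METHOD_E {
    METH_0 = 0,
    METH_1
};

// Primary template: the function name is followed directly by the
// parameter list. Writing test_func<METH>() here is what produces
// "error: test_func is not a template".
template <METHOD_E METH>
__host__ __device__ inline int test_func()
{
    return int(METH);
}

// Hypothetical helper: instantiates the template for both enumerators
// on the host and adds the results (0 + 1).
inline int sum_of_methods()
{
    return test_func<METH_0>() + test_func<METH_1>();
}
```

The template argument list appears only at the call sites, test_func<METH_0>() and test_func<METH_1>(), never in the primary definition.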


Is this what you want, or did I get your problem wrong?

enum METHOD_E {
    METH_0 = 0,
    METH_1
};

template <enum METHOD_E METH>
inline __device__ int test_func()
{
    return int(METH);
}

template <>
inline __device__ int test_func<METH_0>()
{
    return -42;
}
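
If specialization is indeed what you are after, the following sketch shows how the two definitions would be selected. It is only an illustration under a few assumptions of mine: the functions are marked __host__ __device__ so the dispatch can be checked on the host, the macro guard lets the file build without nvcc, and the kernel's out parameter is hypothetical (the original kernel takes no arguments).

```cpp
#include <cassert>

// Assumption: make the CUDA qualifiers no-ops when not compiling with nvcc.
#ifndef __CUDACC__
#define __host__
#define __device__
#endif

enum METHOD_E {
    METH_0 = 0,
    METH_1
};

// Primary template: used for every enumerator without a specialization.
template <METHOD_E METH>
__host__ __device__ inline int test_func()
{
    return int(METH);
}

// Explicit specialization: here the <METH_0> after the name is required,
// and it must be declared before the first use of test_func<METH_0>.
template <>
__host__ __device__ inline int test_func<METH_0>()
{
    return -42;
}

#ifdef __CUDACC__
// Device-side use; out is a hypothetical buffer of at least two ints.
__global__ void test_kernel(int *out)
{
    out[0] = test_func<METH_0>();   // picks the specialization: -42
    out[1] = test_func<METH_1>();   // picks the primary template: 1
}
#endif
```

Note that the name<args> form is the same syntax that caused the original error; it is accepted here only because a primary template named test_func has already been declared.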