I am developing a CUDA 4.0 application running on a Fermi card. According to the specs, Fermi has Compute Capability 2.0 and therefore should support non-inlined function calls.
I compile every class I have with nvcc 4.0 in a distinct obj file. Then, I l开发者_运维问答ink them all with g++-4.4.
Consider the following code :
[File A.cuh]
#include <cuda_runtime.h>
struct A
{
__device__ __host__ void functionA();
};
[File B.cuh]
#include <cuda_runtime.h>
struct B
{
__device__ __host__ void functionB();
};
[File A.cu]
#include "A.cuh"
#include "B.cuh"
void A::functionA()
{
B b;
b.functionB();
}
Attempting to compile A.cu with nvcc -o A.o -c A.cu -arch=sm_20
outputs Error: External calls are not supported (found non-inlined call to _ZN1B9functionBEv)
.
I must be doing something wrong, but what ?
As explained on this thread on the NVidia forums, it appears that even though Fermi supports non-inlined functions, nvcc still needs to have all the functions available during compilation, i.e. in the same source file: there is no linker (yep, that's a pity...).
functionB
is not declared and therefore considered external call. As the error said external calls are not supported. Implement functionB
and it will work.
True, CUDA 5.0 does it. I can't get it to expose external device variables but device methods work just fine. Not by default.
The nvcc option is "-rdc=true". In Visual Studio and Nsight it is an option in the project properties under Configuration Properties -> CUDA C/C++ -> Common -> Generate Relocatable Device Code.
精彩评论