For example I have array of (x,y) points and I want to organ开发者_如何学Goize them in kd-tree
Building kd-tree includes sorting and computing bounding boxes. These algorithms work fine on CUDA, but is there any way to build kd-tree utilizing as many threads as possible?
I think there should be some tricks:
Usually, kd-tree is implemented with recursion, but as far as I know, CUDA processors don't have hardware stack, so recursion should be avoided.
How can I build kd-tree in Cuda effectively?
You might want to have a look at the following papers:
Stackless KD-Tree Traversal for High Performance GPU Ray Tracing
Real-Time KD-Tree Construction on Graphics Hardware
They might help you along. Google them and you'll find them available online.
精彩评论