I have lines of 3 hashes each (i.e. MD5, 128 bits per hash), and plenty of them: think billions, so they won't fit in main memory. They are in a file and need to be sorted. Using GNU sort obviously takes a long time, but it works.
I think it could be worth splitting each line into a vector of six 64-bit ints, sorting those in batches with OpenCL, and then merge-joining the sorted batches. I have a Radeon HD 6950 with 2 GB at hand.
But I have no experience with OpenCL.
So, the questions:
Which OpenCL data structure would I want to use for this task?
Which sorting algorithm should I use?
Could the merge-join also be accelerated?
Since the data is on disk, I would just use STXXL, an STL-like library for data sets that don't fit in main memory.
http://stxxl.sourceforge.net/
OpenCL code for this exists too, but try this first :)
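
To make the batch-and-merge idea from the question concrete, here is a minimal sketch in plain standard C++. It is not STXXL's API and not OpenCL; it only illustrates the general external merge sort pattern (sort runs that fit in memory, then k-way merge them), which is roughly what an external-memory sort automates. The names `Key`, `parse_line`, `make_sorted_runs` and `merge_runs` are made up for illustration, and the code assumes each line holds exactly 96 hex digits (3 x 32) in a consistent case. Runs are kept in RAM here to keep the example short; a real program would write each sorted run to a temporary file and stream the merge.

```cpp
#include <algorithm>
#include <array>
#include <cstdint>
#include <functional>
#include <queue>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// One input line: three 128-bit hashes packed as six 64-bit words, most
// significant word first, so std::array's lexicographic operator< compares
// the hashes in the same order as their fixed-width hex strings.
using Key = std::array<std::uint64_t, 6>;

// Parse 96 hex digits (3 x 32); any non-hex character is treated as a separator.
Key parse_line(const std::string& line) {
    Key key{};
    std::size_t digits = 0;
    for (char c : line) {
        int v;
        if (c >= '0' && c <= '9') v = c - '0';
        else if (c >= 'a' && c <= 'f') v = c - 'a' + 10;
        else if (c >= 'A' && c <= 'F') v = c - 'A' + 10;
        else continue;  // skip spaces, tabs, etc.
        if (digits == 96) throw std::invalid_argument("too many hex digits");
        key[digits / 16] = (key[digits / 16] << 4) | static_cast<std::uint64_t>(v);
        ++digits;
    }
    if (digits != 96) throw std::invalid_argument("expected 96 hex digits");
    return key;
}

// Phase 1: cut the input into runs of at most run_size keys and sort each run.
// (Assumption: run_size is chosen so that one run fits in available memory.)
std::vector<std::vector<Key>> make_sorted_runs(const std::vector<Key>& input,
                                               std::size_t run_size) {
    if (run_size == 0) throw std::invalid_argument("run_size must be positive");
    std::vector<std::vector<Key>> runs;
    for (std::size_t i = 0; i < input.size(); i += run_size) {
        std::size_t end = std::min(input.size(), i + run_size);
        runs.emplace_back(input.begin() + i, input.begin() + end);
        std::sort(runs.back().begin(), runs.back().end());
    }
    return runs;
}

// Phase 2: k-way merge of the sorted runs using a min-heap that holds the
// current front element of every run that still has elements left.
std::vector<Key> merge_runs(const std::vector<std::vector<Key>>& runs) {
    using Item = std::pair<Key, std::size_t>;  // (key, index of its run)
    std::priority_queue<Item, std::vector<Item>, std::greater<Item>> heap;
    std::vector<std::size_t> pos(runs.size(), 0);
    for (std::size_t r = 0; r < runs.size(); ++r)
        if (!runs[r].empty()) heap.push({runs[r][0], r});

    std::vector<Key> out;
    while (!heap.empty()) {
        Item top = heap.top();
        heap.pop();
        out.push_back(top.first);
        std::size_t r = top.second;
        if (++pos[r] < runs[r].size()) heap.push({runs[r][pos[r]], r});
    }
    return out;
}
```

Each record is 48 bytes, so a billion of them is about 48 GB. The run size, and how many runs get merged at once, would be chosen based on available RAM and disk throughput. Whether a GPU helps with the per-run sort depends on whether the job is bound by disk I/O rather than by comparisons, which is one reason to try a disk-oriented approach first.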