This paper investigates two sorting algorithms, counting sort and a variation called occurrence sort that also removes duplicate elements, and examines their suitability for running on the GPU. The duplicate-removing variation turns out to have a natural functional, data-parallel implementation, which makes it particularly interesting for GPUs. The algorithms are implemented in Obsidian, a high-level domain-specific language for GPU programming. Measurements show that our implementations in many cases outperform the sorting algorithm provided by the Thrust library. Furthermore, occurrence sort is a further factor of two faster than ordinary counting sort. We conclude that counting sort is an important contender when considering sorting algorithms for the GPU, and that occurrence sort is highly preferable when applicable. We also show that Obsidian can produce very competitive code.
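To make the difference between the two algorithms concrete, the following is a minimal sequential sketch in Python. It is not the paper's Obsidian code: the function names, the `key_range` parameter, and the flag/prefix-sum/scatter structure of `occurrence_sort` are assumptions chosen for illustration, and the keys are assumed to be non-negative integers smaller than `key_range`.

```python
from itertools import accumulate


def counting_sort(keys, key_range):
    """Sort integers in [0, key_range) by counting how often each value occurs."""
    counts = [0] * key_range
    for k in keys:
        # Histogram step: several keys may increment the same counter.
        counts[k] += 1
    out = []
    for value, n in enumerate(counts):
        # Emit each value as many times as it was counted.
        out.extend([value] * n)
    return out


def occurrence_sort(keys, key_range):
    """Sort integers in [0, key_range) and drop duplicates (illustrative sketch)."""
    seen = [0] * key_range
    for k in keys:
        # Only record that the value occurs; every writer stores the same 1.
        seen[k] = 1
    # Inclusive prefix sum: positions[v] is the number of distinct values <= v.
    positions = list(accumulate(seen))
    out = [0] * (positions[-1] if positions else 0)
    for value, flag in enumerate(seen):
        if flag:
            # Scatter each present value to its rank among the distinct values.
            out[positions[value] - 1] = value
    return out


if __name__ == "__main__":
    data = [3, 1, 2, 3, 0, 1]
    print(counting_sort(data, 4))
    print(occurrence_sort(data, 4))
```

In this sketch the occurrence variant replaces the counter increments with identical flag writes followed by a prefix sum and a scatter, which is one plausible reading of why the duplicate-removing version lends itself to a data-parallel formulation.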