The periodic balanced sorting network
Journal of the ACM (JACM)
A randomized parallel sorting algorithm with an experimental study
Journal of Parallel and Distributed Computing
Managing gigabytes (2nd ed.): compressing and indexing documents and images
Managing gigabytes (2nd ed.): compressing and indexing documents and images
Efficient conditional operations for data-parallel architectures
Proceedings of the 33rd annual ACM/IEEE international symposium on Microarchitecture
Imagine: Media Processing with Streams
IEEE Micro
FOCS '99 Proceedings of the 40th Annual Symposium on Foundations of Computer Science
Photon mapping on programmable graphics hardware
Proceedings of the ACM SIGGRAPH/EUROGRAPHICS conference on Graphics hardware
Adaptive Bitonic Sorting: An Optimal Parallel Algorithm for Shared Memory Machines
Adaptive Bitonic Sorting: An Optimal Parallel Algorithm for Shared Memory Machines
UberFlow: a GPU-based particle engine
Proceedings of the ACM SIGGRAPH/EUROGRAPHICS conference on Graphics hardware
Fast and approximate stream mining of quantiles and frequencies using graphics processors
Proceedings of the 2005 ACM SIGMOD international conference on Management of data
Fast on-line index construction by geometric partitioning
Proceedings of the 14th ACM international conference on Information and knowledge management
GPUTeraSort: high performance graphics co-processor sorting for large database management
Proceedings of the 2006 ACM SIGMOD international conference on Management of data
Scan primitives for GPU computing
Proceedings of the 22nd ACM SIGGRAPH/EUROGRAPHICS symposium on Graphics hardware
Introduction to Information Retrieval
Introduction to Information Retrieval
A Practical Quicksort Algorithm for Graphics Processors
ESA '08 Proceedings of the 16th annual European symposium on Algorithms
Sorting networks and their applications
AFIPS '68 (Spring) Proceedings of the April 30--May 2, 1968, spring joint computer conference
The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines
The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines
MapReduce for the cell broadband engine architecture
IBM Journal of Research and Development
SPDP '93 Proceedings of the 1993 5th IEEE Symposium on Parallel and Distributed Processing
Bitonic sort on a chained-cubic tree interconnection network
Journal of Parallel and Distributed Computing
Hi-index | 0.00 |
Although sort has been extensively studied in many research works, it still remains a challenge in particular if we consider the implications of novel processor technologies such as manycores (i.e. GPUs, Cell/BE, multicore, etc.). In this paper, we compare different algorithms for sorting integers on stream multiprocessors and we discuss their viability on large datasets (such as those managed by search engines). In order to fully exploit the potentiality of the underlying architecture, we designed an optimized version of sorting network in the K-model, a novel computational model designed to consider all the important features of many-core architectures. According to K-model, our bitonic sorting network mapping improves the three main aspects of many-core architectures, i.e. the processors exploitation, and the on-chip/off-chip memory bandwidth utilization. Furthermore we are able to attain a space complexity of @Q(1). We experimentally compare our solution with state-of-the-art ones (namely, Quicksort and Radixsort) on GPUs. We also compute the complexity in the K-model for such algorithms. The conducted evaluation highlight that our bitonic sorting network is faster than Quicksort and slightly slower than radix, yet being an in-place solution it consumes less memory than both algorithms.