Parallel merge sort is useful for sorting large quantities of data progressively. It must be parallelized with care, because the conventional algorithm performs poorly: the number of participating processors is halved at each merging stage, until only one remains in the last stage. The proposed load-balanced merge sort instead uses all processors throughout the computation. It distributes the data evenly across all processors at every stage, so every processor works in every phase. The resulting performance gain is significant, up to a speedup of (P−1)/log P, where P is the number of processors. Experimental results show a speedup of 9.6 (upper bound 10.7) on a 32-processor Cray T3E when sorting 4M 32-bit integers, and a speedup of 2.3 (upper bound 2.8) on an 8-node PC cluster.
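The contrast between the two schemes can be illustrated with a small sequential simulation. This is a minimal sketch, not the paper's implementation: the function names `conventional_stages` and `balanced_stages` are hypothetical, P is assumed to be a power of two, every processor is assumed to start with the same number of elements, and the step in which each processor computes its own share of a merge independently is replaced here by a plain merge followed by an even split.

```python
from heapq import merge


def conventional_stages(blocks):
    """Tree merge: the number of working processors halves every stage.

    Assumes len(blocks) is a power of two (illustrative simplification).
    """
    runs = [sorted(b) for b in blocks]  # local sort on each processor
    active = []
    while len(runs) > 1:
        # Each surviving processor merges its run with its neighbour's run.
        runs = [list(merge(runs[i], runs[i + 1]))
                for i in range(0, len(runs), 2)]
        active.append(len(runs))  # processors doing merge work this stage
    return runs[0], active


def balanced_stages(blocks):
    """Merge neighbouring groups, then split each result evenly again,
    so every processor still holds the same amount of data after a stage.

    Assumes a power-of-two processor count and equal block sizes.
    """
    p = len(blocks)
    per_proc = [sorted(b) for b in blocks]  # local sort on each processor
    active = []
    group = 1  # number of processors that jointly hold one sorted run
    while group < p:
        nxt = []
        for start in range(0, p, 2 * group):
            # Consecutive blocks inside a group already form one sorted run.
            left = [x for b in per_proc[start:start + group] for x in b]
            right = [x for b in per_proc[start + group:start + 2 * group]
                     for x in b]
            merged = list(merge(left, right))
            share = len(merged) // (2 * group)
            # Hand an equal slice of the merged run to each processor.
            nxt.extend(merged[i * share:(i + 1) * share]
                       for i in range(2 * group))
        per_proc = nxt
        active.append(sum(1 for b in per_proc if b))
        group *= 2
    return [x for b in per_proc for x in b], active
```

With 8 equal-sized input blocks, the conventional version reports `[4, 2, 1]` working processors over its three stages, while the balanced version reports `[8, 8, 8]`; both return the same fully sorted sequence.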