Merge sort is useful for sorting large amounts of data progressively, especially when the data can be partitioned and easily collected onto a few processors. Merge sort can be parallelized; however, conventional algorithms on distributed-memory computers perform poorly because the number of participating processors is halved at each merging stage, down to a single processor in the last stage. This paper presents a load-balanced parallel merge sort in which all processors take part in merging throughout the computation. Data are evenly distributed across all processors, and every processor works in every merging phase. An analysis shows that the upper bound on the speedup of the merge time is (P - 1)/log P, where P is the number of processors. We achieved a speedup of 8.2 (the upper bound is 10.5) on a 32-processor Cray T3E when sorting 4M 32-bit integers.
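To illustrate the bottleneck described above, the following is a minimal sketch of the conventional tree-style merge, simulated sequentially. It is not the load-balanced algorithm of the paper. The function name `tree_merge`, the use of Python lists to stand in for per-processor data, and the assumption that the number of processors is a power of two are all illustrative choices.

```python
from heapq import merge


def tree_merge(runs):
    """Simulate a conventional parallel merge sort (illustrative only).

    `runs` holds one sorted list per simulated processor; its length is
    assumed to be a power of two. Returns the fully merged list and the
    number of processors still holding data after each merging stage.
    """
    active = []
    while len(runs) > 1:
        # Neighbouring processors pair up: one of each pair merges both
        # runs, the other becomes idle for the rest of the computation.
        runs = [list(merge(runs[i], runs[i + 1]))
                for i in range(0, len(runs), 2)]
        active.append(len(runs))
    return runs[0], active


# Hypothetical input: 64 keys spread evenly over P = 8 simulated processors.
P = 8
data = [(i * 37) % 101 for i in range(64)]
chunk = len(data) // P
local_runs = [sorted(data[p * chunk:(p + 1) * chunk]) for p in range(P)]
result, active = tree_merge(local_runs)
```

With 8 simulated processors, `active` comes out as `[4, 2, 1]`: the last stage runs on a single processor that must merge all the data, which is the imbalance the load-balanced scheme is designed to avoid.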