We study the number of comparison steps required for searching, merging, and sorting with P processors. We present a merging algorithm that is optimal up to a constant factor when merging two lists of equal size (independent of the number of processors); as a special case, with N processors it merges two lists, each of size N, in 1.893 lg lg N + 4 comparison steps. We use the merging algorithm to obtain a sorting algorithm that, in particular, sorts N values with N processors in 1.893 lg N lg lg N / lg lg lg N (plus lower-order terms) comparison steps. The algorithms can be implemented, with constant overhead at each comparison step, on a shared-memory machine that allows concurrent reads from the same location.
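To give a feel for the size of these bounds, the short sketch below evaluates the two closed-form expressions quoted in the abstract for a given N. It is only arithmetic on the stated formulas, not an implementation of the merging or sorting algorithm. The function names (`lg`, `merge_steps`, `sort_steps_leading`) are hypothetical, lg is taken to be the base-2 logarithm, and the unspecified lower-order terms of the sorting bound are ignored.

```python
import math


def lg(x: float) -> float:
    """Base-2 logarithm (assumed meaning of 'lg' in the abstract)."""
    return math.log2(x)


def merge_steps(n: int) -> float:
    """Evaluate 1.893 lg lg N + 4, the stated step count for merging
    two lists of size N with N processors."""
    if n < 2:
        raise ValueError("lg lg N is only defined for N >= 2")
    return 1.893 * lg(lg(n)) + 4


def sort_steps_leading(n: int) -> float:
    """Evaluate the leading term 1.893 lg N lg lg N / lg lg lg N of the
    stated sorting bound. Lower-order terms are not given in the
    abstract and are ignored here (an assumption of this sketch)."""
    if n <= 4:
        # lg lg lg N is zero at N = 4 and undefined or negative below it.
        raise ValueError("the leading term needs N > 4")
    return 1.893 * lg(n) * lg(lg(n)) / lg(lg(lg(n)))


if __name__ == "__main__":
    n = 2 ** 16
    print(f"N = {n}")
    print(f"merge bound:        {merge_steps(n):.3f} comparison steps")
    print(f"sort leading term:  {sort_steps_leading(n):.3f} comparison steps")
```

For N = 2^16 we have lg N = 16, lg lg N = 4, and lg lg lg N = 2, so the merging bound evaluates to 1.893 · 4 + 4 = 11.572 and the leading term of the sorting bound to 1.893 · 16 · 4 / 2 = 60.576.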