The selection problem has been studied extensively on sequential machines. Most researchers regard two solutions as the standard: one that runs in linear time on average and one that runs in linear time in the worst case. Theoretical work also exists for parallel models, but it has not been widely implemented on parallel machines. This paper presents an in-depth analysis of implementations of the standard algorithms on a number of multiprocessors and supercomputers spanning the entire spectrum of Flynn's classification, using both an imperative language (C-based languages with vendor-specific parallel extensions) and a functional language (SISAL). The results of all the experiments lead us to conclude that the selection problem has very efficient parallel implementations. Hand-tuned C programs with parallel extensions provided good efficiency but were time-consuming to develop. The SISAL code, on the other hand, is fully portable, and the same program was used on all the machines. The performance of the SISAL implementations was comparable to that of the hand-tuned C implementations. In all the tests, the routines sustained good speed-up and reasonable efficiency, even with a large number of processors. In two cases (one machine using SISAL and one using a C-based language), we obtained an efficiency higher than 80% with a configuration close to or equal to the maximum number of processors.
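As a point of reference, the two sequential solutions mentioned above can be sketched as follows. This is a minimal illustration, not the paper's code: it assumes the linear average-time solution is a randomized quickselect and the linear worst-case solution is a median-of-medians selection with groups of five. The function names `quickselect` and `median_of_medians_select` are hypothetical, and the sketch is purely sequential, so it does not reflect the parallel C or SISAL implementations that were measured.

```python
import random


def quickselect(values, k, rng=random):
    """Return the k-th smallest element (0-based); expected linear time.

    Assumption: a randomized pivot stands in for the "linear average time"
    solution; the paper's actual pivot rule is not stated in the abstract.
    """
    if not 0 <= k < len(values):
        raise IndexError("k out of range")
    items = list(values)  # work on a copy so the input is left untouched
    while True:
        pivot = rng.choice(items)
        lower = [x for x in items if x < pivot]
        equal_count = sum(1 for x in items if x == pivot)
        if k < len(lower):
            items = lower                      # answer lies below the pivot
        elif k < len(lower) + equal_count:
            return pivot                       # answer is the pivot itself
        else:
            k -= len(lower) + equal_count      # skip everything <= pivot
            items = [x for x in items if x > pivot]


def median_of_medians_select(values, k):
    """Return the k-th smallest element (0-based); worst-case linear time.

    Assumption: groups of five with a recursively chosen pivot stand in
    for the "linear worst-case" solution.
    """
    if not 0 <= k < len(values):
        raise IndexError("k out of range")
    items = list(values)
    while True:
        if len(items) <= 5:
            return sorted(items)[k]
        # Median of each group of at most five elements.
        groups = [sorted(items[i:i + 5]) for i in range(0, len(items), 5)]
        medians = [g[len(g) // 2] for g in groups]
        # The pivot is the median of those medians, found recursively.
        pivot = median_of_medians_select(medians, len(medians) // 2)
        lower = [x for x in items if x < pivot]
        equal_count = sum(1 for x in items if x == pivot)
        if k < len(lower):
            items = lower
        elif k < len(lower) + equal_count:
            return pivot
        else:
            k -= len(lower) + equal_count
            items = [x for x in items if x > pivot]
```

For example, both `quickselect(data, len(data) // 2)` and `median_of_medians_select(data, len(data) // 2)` return the upper median of `data`. The three-way split (lower, equal, greater) is a design choice made here so that inputs with many duplicates still shrink on every iteration.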