Practical Parallel Algorithms for Dynamic Data Redistribution, Median Finding, and Selection
IPPS '96 Proceedings of the 10th International Parallel Processing Symposium
A common statistical problem is that of finding the median element in a set of data. This paper presents a fast and portable parallel algorithm for finding the median of a set of elements distributed across a parallel machine. In fact, our algorithm solves the general selection problem, which requires determining the element of rank i for an arbitrarily given integer i. Practical algorithms needed by our selection algorithm for the dynamic redistribution of data are also discussed. Our general framework is a distributed memory programming model enhanced by a set of communication primitives. We use efficient techniques for distributing, coalescing, and load balancing data, as well as efficient combinations of task and data parallelism. The algorithms have been coded in Split-C and run on a variety of platforms, including the Thinking Machines CM-5, IBM SP-1 and SP-2, Cray Research T3D, Meiko Scientific CS-2, Intel Paragon, and workstation clusters. Our experimental results illustrate the scalability and efficiency of our algorithms across different platforms and improve upon all the related experimental results known to the authors.
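To make the selection problem concrete, the following is a minimal sequential sketch in Python, not the Split-C implementation described above. It simulates the processors as a list of lists and assumes a simple pivot rule (the median of the local medians) that may differ from the one used in the paper. The names `parallel_select`, `rebalance`, and `cutoff` are hypothetical, and `rebalance` only stands in for a dynamic redistribution step by flattening and re-splitting the data, where a real implementation would move just the surplus elements between processors.

```python
def rebalance(parts):
    """Give every simulated processor an almost equal share of the data.

    Assumption: flatten-and-resplit stands in for real redistribution,
    which would exchange only the surplus elements between processors.
    """
    flat = [x for part in parts for x in part]
    p = len(parts)
    q, r = divmod(len(flat), p)
    out, start = [], 0
    for j in range(p):
        size = q + (1 if j < r else 0)  # the first r processors get one extra
        out.append(flat[start:start + size])
        start += size
    return out


def parallel_select(parts, i, cutoff=32):
    """Return the element of rank i (1-based) over all lists in parts."""
    total = sum(len(part) for part in parts)
    if not 1 <= i <= total:
        raise ValueError("rank out of range")
    while total > cutoff:
        # Each non-empty processor contributes its local median.
        medians = [sorted(part)[len(part) // 2] for part in parts if part]
        pivot = sorted(medians)[len(medians) // 2]
        # Local counts, summed as a global reduction would sum them.
        less = sum(1 for part in parts for x in part if x < pivot)
        equal = sum(1 for part in parts for x in part if x == pivot)
        if i <= less:
            parts = [[x for x in part if x < pivot] for part in parts]
        elif i <= less + equal:
            return pivot
        else:
            parts = [[x for x in part if x > pivot] for part in parts]
            i -= less + equal
        # Discarding elements can unbalance the processors, so rebalance.
        parts = rebalance(parts)
        total = sum(len(part) for part in parts)
    # Few elements remain: gather them and finish sequentially.
    return sorted(x for part in parts for x in part)[i - 1]


parts = [[9, 1, 7], [4, 8], [2, 6, 3, 5]]
print(parallel_select(parts, 5, cutoff=2))  # prints 5, the median of 1..9
```

Each round discards at least the pivot, so the loop terminates. Rebalancing after every round keeps the per-processor work roughly equal, which is the role load balancing plays in the approach the abstract describes.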