We introduce a computation model for developing and analyzing parallel algorithms on distributed memory machines. The model allows the design of algorithms using a single address space and does not assume any particular interconnection topology. We capture performance by incorporating a cost measure for interprocessor communication induced by remote memory accesses. The cost measure includes parameters reflecting memory latency, communication bandwidth, and spatial locality. Our model allows the initial placement of the input data and pipelined prefetching.

We use our model to develop parallel algorithms for various data rearrangement problems, load balancing, sorting, FFT, and matrix multiplication. We show that most of these algorithms achieve optimal or near optimal communication complexity while simultaneously guaranteeing an optimal speed-up in computational complexity. Ongoing experimental work in testing and evaluating these algorithms has thus far shown very promising results.
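
To make the role of the three cost parameters concrete, the following is a minimal sketch of a generic latency-plus-bandwidth cost measure. It is not the model's actual definition, which the abstract does not give. The names `CostParams`, `latency`, `per_word`, `block_words`, `blocking_cost`, and `pipelined_cost` are hypothetical, and the linear cost formulas are assumptions chosen only to illustrate why pipelined prefetching can hide latency.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CostParams:
    """Hypothetical parameters of an assumed linear communication cost."""
    latency: float     # assumed fixed delay before a remote request is served
    per_word: float    # assumed transfer time per word (inverse bandwidth)
    block_words: int   # assumed words fetched together (spatial locality)


def blocking_cost(params: CostParams, num_blocks: int) -> float:
    """Cost when each remote block access waits for the previous one.

    Every access pays the full latency plus the transfer time of one block.
    """
    per_block = params.latency + params.per_word * params.block_words
    return num_blocks * per_block


def pipelined_cost(params: CostParams, num_blocks: int) -> float:
    """Cost when all requests are issued back to back (pipelined prefetch).

    Under this assumption the latency is paid once and the transfers
    stream behind it at the bandwidth rate.
    """
    if num_blocks == 0:
        return 0.0
    transfer = params.per_word * params.block_words * num_blocks
    return params.latency + transfer


if __name__ == "__main__":
    params = CostParams(latency=100.0, per_word=1.0, block_words=8)
    print(blocking_cost(params, 10))   # 1080.0
    print(pipelined_cost(params, 10))  # 180.0
```

Under these assumed formulas the gap between the two costs grows with the number of remote accesses, which is the kind of effect a cost measure with separate latency and bandwidth terms is meant to expose.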