We demonstrate an approach to parallel programming based on skeletons: parameterized program schemas with efficient implementations over diverse architectures. The contribution of the paper is twofold: (1) we classify divide-and-conquer (DC) algorithms and provide a family of provably correct parallel implementations for a particular DC skeleton, called DH (distributable homomorphism); (2) we adjust the mathematical specification of the Fast Fourier Transform (FFT) to the DH skeleton and thereby obtain a generic SPMD program, well suited for implementation under MPI. The generic program includes, as special cases, the efficient FFT solutions used in practice: the binary-exchange and the 2D- and 3D-transpose implementations.
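To make the idea of a DC skeleton concrete, the following is a minimal sequential sketch, not the paper's formal definition and not its MPI program. It assumes the usual shape of a distributable homomorphism: split a list of length 2^k into two halves, recurse on each, then combine the two results elementwise with one operator for the left half of the output and another for the right half. The names `dh`, `bit_reverse`, and `fft` are hypothetical. Passing the position `k` and block length `n` to the operators, and feeding the input in bit-reversed order, are assumptions made here so that the radix-2 FFT fits the schema; the paper's own adjustment of the FFT specification may differ.

```python
import cmath


def dh(oplus, otimes, xs):
    """DH-style divide-and-conquer over a list whose length is a power of two.

    Assumption of this sketch: both operators take (k, n, a, b), where k is
    the position inside a half and n is the length of the combined block.
    """
    n = len(xs)
    if n == 1:
        return list(xs)
    half = n // 2
    u = dh(oplus, otimes, xs[:half])   # result for the left half
    v = dh(oplus, otimes, xs[half:])   # result for the right half
    left = [oplus(k, n, a, b) for k, (a, b) in enumerate(zip(u, v))]
    right = [otimes(k, n, a, b) for k, (a, b) in enumerate(zip(u, v))]
    return left + right


def bit_reverse(xs):
    """Permute xs so that element i moves to the bit-reversed index of i."""
    n = len(xs)
    bits = n.bit_length() - 1
    out = []
    for i in range(n):
        j = 0
        for b in range(bits):
            if (i >> b) & 1:
                j |= 1 << (bits - 1 - b)
        out.append(xs[j])
    return out


def fft(xs):
    """Radix-2 FFT expressed as an instance of the dh schema above."""
    def twiddle(k, n):
        return cmath.exp(-2j * cmath.pi * k / n)

    return dh(
        lambda k, n, a, b: a + twiddle(k, n) * b,  # butterfly, upper output
        lambda k, n, a, b: a - twiddle(k, n) * b,  # butterfly, lower output
        bit_reverse(xs),
    )


if __name__ == "__main__":
    # With addition for both operators, every position receives the total sum.
    plus = lambda k, n, a, b: a + b
    print(dh(plus, plus, [1, 2, 3, 4]))
    print(fft([1, 0, 0, 0]))
```

In this sketch each recursion level pairs element `k` of one half with element `k` of the other. In a parallel setting, that pairing is what a pairwise exchange between processors would carry out at each level, which is the general pattern behind a binary-exchange FFT.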