Scanning polyhedra with DO loops
PPOPP '91 Proceedings of the third ACM SIGPLAN symposium on Principles and practice of parallel programming
Communication optimization and code generation for distributed memory machines
PLDI '93 Proceedings of the ACM SIGPLAN 1993 conference on Programming language design and implementation
Communication-free hyperplane partitioning of nested loops
Journal of Parallel and Distributed Computing
Some efficient solutions to the affine scheduling problem: I. One-dimensional time
International Journal of Parallel Programming
The Omega Library interface guide
Minimizing communication while preserving parallelism
ICS '96 Proceedings of the 10th international conference on Supercomputing
Transitive closure of infinite graphs and its applications
International Journal of Parallel Programming - Special issue: selected papers from the eighth international workshop on languages and compilers for parallel computing
Maximizing parallelism and minimizing synchronization with affine transforms
Proceedings of the 24th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Loop parallelization algorithms: from parallelism extraction to code generation
Parallel Computing - Special issues on languages and compilers for parallel computers
Scheduling and Automatic Parallelization
An Exact Method for Analysis of Value-based Array Data Dependences
Proceedings of the 6th International Workshop on Languages and Compilers for Parallel Computing
Communication-Free Parallelization via Affine Transformations
LCPC '94 Proceedings of the 7th International Workshop on Languages and Compilers for Parallel Computing
Classifying Loops for Space-Time Mapping
Euro-Par '96 Proceedings of the Second International Euro-Par Conference on Parallel Processing - Volume I
Finding synchronization-free parallelism for non-uniform loops
ICCS'03 Proceedings of the 2003 international conference on Computational science: Part II
This paper presents algorithms for finding synchronization-free threads composed of iterations of perfectly nested uniform and non-uniform loops. The algorithms require an exact representation of loop-carried dependences. To describe and implement them, the dependence analysis of Pugh and Wonnacott was chosen, in which dependences are represented as tuple relations. The main advantage of the proposed approach is that it can extract more synchronization-free parallelism than well-known techniques, including the affine partitioning framework. The algorithms have been implemented and verified using the Omega project software. Experiments with the Livermore loops are presented.
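The idea of a synchronization-free thread can be illustrated with a small sketch. It is not the paper's algorithm, which works symbolically on tuple relations with the Omega software. Instead it assumes a small, fully enumerated iteration space and a dependence relation given as explicit (source, destination) pairs, and groups iterations into weakly connected components with a union-find structure. The function name `sync_free_threads` and the example loop `a[2*i] = a[i]` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def sync_free_threads(iterations, dependences):
    """Group iterations into sets that share no dependence with each other.

    Assumption: `dependences` lists every loop-carried dependence exactly,
    as (source_iteration, destination_iteration) pairs.
    """
    parent = {it: it for it in iterations}

    def find(x):
        # Follow parent links to the representative, halving paths on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Iterations linked by a dependence must run in the same thread.
    for src, dst in dependences:
        if src in parent and dst in parent:
            parent[find(src)] = find(dst)

    groups = defaultdict(list)
    for it in iterations:
        groups[find(it)].append(it)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical non-uniform loop:  for i in 1..8:  a[2*i] = a[i]
# Iteration i writes a[2*i], which iteration 2*i later reads,
# so there is a flow dependence i -> 2*i whenever 2*i <= 8.
N = 8
iterations = list(range(1, N + 1))
dependences = [(i, 2 * i) for i in iterations if 2 * i <= N]

threads = sync_free_threads(iterations, dependences)
print(threads)  # [[1, 2, 4, 8], [3, 6], [5], [7]]
```

Each resulting group can run as an independent thread, provided the iterations inside a group keep their original order. The dependence distance in this example (i to 2*i) is not constant, which is what makes the loop non-uniform.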