Free scheduling for statement instances of parameterized arbitrarily nested affine loops

Authors:
Wlodzimierz Bielecki;Marek Palkowski;Tomasz Klimek
Affiliations:
West Pomeranian University of Technology, Szczecin, Poland;West Pomeranian University of Technology, Szczecin, Poland;West Pomeranian University of Technology, Szczecin, Poland
Venue:
Parallel Computing
Year:
2012

Abstract

We present an approach for building a free schedule for the statement instances of parameterized, arbitrarily nested affine loops. Under a free schedule, each loop statement instance is executed as soon as its operands are available, which allows maximal fine-grained loop parallelism to be extracted and minimizes the number of synchronization events. The approach is based on calculating the power k of a relation that exactly represents all dependences in a loop. In general, such a relation is a union of simpler relations. When the free schedule is difficult to calculate because the number of simpler dependence relations is large, we discuss another technique that extracts a free schedule in an iteration subspace defined by the indices of the inner nests of the loop. We demonstrate that whenever the power k of the relation describing all dependences in the loop can be calculated, a free schedule can also be produced. We outline experimental results on the effectiveness, efficiency, and time complexity of the algorithms, and discuss the problems that remain to be resolved in order to exploit the full power of the presented techniques.
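The link between the power k of a dependence relation and a free schedule can be illustrated on a small, fixed-size example. The sketch below is not the authors' algorithm, which works symbolically on parameterized affine relations. It enumerates a finite relation explicitly and assumes a hypothetical loop nest `a[i][j] = a[i-1][j] + a[i][j-1]` with `1 <= i, j <= n`. All function names are illustrative. The idea it relies on is that, for an acyclic relation R, an instance is executed at time k exactly when it lies in the range of R^k but not in the range of R^(k+1).

```python
from itertools import product


def dependence_relation(n):
    """Dependences of the assumed loop a[i][j] = a[i-1][j] + a[i][j-1]."""
    rel = set()
    for i, j in product(range(1, n + 1), repeat=2):
        if i + 1 <= n:
            rel.add(((i, j), (i + 1, j)))  # a[i][j] is read at (i+1, j)
        if j + 1 <= n:
            rel.add(((i, j), (i, j + 1)))  # a[i][j] is read at (i, j+1)
    return rel


def compose(r, s):
    """Pairs (a, c) such that (a, b) is in r and (b, c) is in s."""
    successors = {}
    for b, c in s:
        successors.setdefault(b, set()).add(c)
    return {(a, c) for a, b in r for c in successors.get(b, ())}


def free_schedule(space, rel):
    """Map each instance to its earliest execution time.

    Assumes rel is finite and acyclic and that both endpoints of every
    dependence lie in space; otherwise the loop would not terminate or
    the result would be incomplete.
    """
    times = {}
    current_range = set(space)  # range of R^0: every instance
    power = set(rel)            # R^1
    k = 0
    while current_range:
        next_range = {dst for _, dst in power}  # range of R^(k+1)
        # Instances with a dependence chain of length k but none of k+1.
        for inst in current_range - next_range:
            times[inst] = k
        current_range = next_range
        power = compose(power, rel)  # advance to the next power of R
        k += 1
    return times


n = 4
space = set(product(range(1, n + 1), repeat=2))
schedule = free_schedule(space, dependence_relation(n))
```

For this assumed loop, the instances that share a time value form anti-diagonals of the iteration space and can run in parallel, with one synchronization point between consecutive time steps. Explicit enumeration is only practical for small fixed bounds; a parameterized loop needs a symbolic representation of R^k.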