Automatic coarse-grained parallelization of program loops is of great importance for parallel computing systems. This paper presents the theory of Iteration Space Slicing, which aims to extract the synchronization-free parallelism available in arbitrarily nested program loops. We demonstrate that Iteration Space Slicing algorithms permit extracting more coarse-grained parallelism than the Affine Transformation Framework does, provided that we are able to calculate the transitive closure of the union of relations describing all dependences in the affine loop. Experimental results show that Iteration Space Slicing algorithms extract coarse-grained parallelism for many loops of the NAS and UTDSP benchmarks. Problems that must be resolved to enhance the theory of Iteration Space Slicing are discussed.
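To make the idea of a synchronization-free slice concrete, the following is a minimal sketch, not the paper's algorithm. The paper works symbolically on affine dependence relations and their transitive closure; this sketch instead enumerates a small, finite iteration space and groups iterations with union-find, which gives the same grouping as closing the dependence relation together with its inverse. The loop nest, the function names `dependence_edges` and `slices`, and the bound `n` are all assumptions made for illustration.

```python
from itertools import product


def dependence_edges(n):
    """Dependences of a hypothetical toy loop nest:

        for i in 1..n:
            for j in 2..n:
                a[i][j] = a[i][j - 1]

    Iteration (i, j) depends on iteration (i, j - 1).
    """
    return [((i, j - 1), (i, j))
            for i in range(1, n + 1)
            for j in range(2, n + 1)]


def slices(iterations, edges):
    """Group iterations into weakly connected components of the
    dependence graph. No dependence crosses between components, so
    each one can run on its own thread without synchronization."""
    parent = {it: it for it in iterations}

    def find(x):
        # Path-halving lookup of the component representative.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for src, dst in edges:
        parent[find(src)] = find(dst)

    groups = {}
    for it in iterations:
        groups.setdefault(find(it), []).append(it)

    # Lexicographic order inside a slice keeps the original
    # sequential execution order of its iterations.
    return sorted(sorted(group) for group in groups.values())


n = 3  # illustrative bound; the paper handles symbolic bounds
iterations = list(product(range(1, n + 1), repeat=2))
result = slices(iterations, dependence_edges(n))
for k, s in enumerate(result):
    print(f"slice {k}: {s}")
```

In this toy case every row `i` forms its own slice, so the outer loop can be run in parallel with no synchronization. Explicit enumeration only works for small constant bounds, which is why a symbolic transitive closure is needed for parametric loops.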