Efficient partitioning of parallel loops is critical to achieving high performance and efficient use of multiprocessor systems. Although a significant amount of work has been done on partitioning and scheduling loops with rectangular iteration spaces, the problem of partitioning non-rectangular iteration spaces (e.g., triangular or trapezoidal iteration spaces) with variable densities has, to the best of our knowledge, not been addressed so far. In this paper, we present a mathematical model for partitioning N-dimensional non-rectangular iteration spaces with variable densities. We present a unimodular loop transformation and a geometric approach for partitioning an iteration space along the axis corresponding to the outermost loop across a given number of processors so as to achieve near-optimal load balance across the processors. We present a case study to illustrate the effectiveness of our approach.
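To make the load-balancing goal concrete, the following is a minimal sketch of partitioning along the outermost loop axis. It is not the paper's model or its unimodular transformation. It simply assumes the work of each outer iteration is already known as a number, and places cut points where the cumulative work crosses equal shares of the total. The names `triangular_work` and `partition_outer_loop`, and the `density` parameter, are hypothetical and introduced only for illustration.

```python
from bisect import bisect_left
from itertools import accumulate


def triangular_work(n, density=lambda i, j: 1):
    """Work per outer iteration i of the triangular nest
    `for i in range(n): for j in range(i + 1)`.
    `density(i, j)` is an assumed per-point cost (1 = uniform)."""
    return [sum(density(i, j) for j in range(i + 1)) for i in range(n)]


def partition_outer_loop(work, num_procs):
    """Split outer indices 0..len(work)-1 into num_procs contiguous
    half-open ranges with roughly equal total work.
    Assumes `work` is non-empty and its entries are non-negative."""
    prefix = list(accumulate(work))  # prefix[i] = work[0] + ... + work[i]
    total = prefix[-1]
    bounds = [0]
    for p in range(1, num_procs):
        target = total * p / num_procs
        # First outer index at which cumulative work reaches the target;
        # the cut is placed just after that index.
        bounds.append(bisect_left(prefix, target) + 1)
    bounds.append(len(work))
    return [(bounds[k], bounds[k + 1]) for k in range(num_procs)]


def chunk_loads(work, chunks):
    """Total work assigned to each chunk."""
    return [sum(work[lo:hi]) for lo, hi in chunks]
```

For a uniform-density triangular space, `partition_outer_loop(triangular_work(100), 4)` gives the processors that own the early (short) rows more outer iterations than those that own the late (long) rows, whereas an equal-width split of the outer loop would leave the last processor with far more work than the first. Because cuts fall only on outer-iteration boundaries, each processor's load can exceed the ideal share by at most the work of one outer iteration.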