On Time Optimal Supernode Shape

Authors:
Edin Hodzic;Weijia Shang
Affiliations:
-;-
Venue:
IEEE Transactions on Parallel and Distributed Systems
Year:
2002

Abstract

This paper addresses the selection of an optimal supernode shape for a supernode transformation (also known as tiling), with the objective of minimizing the total execution time of a parallel program on a distributed-memory parallel computer. We identify three parameters of a supernode transformation: supernode size, relative side lengths, and cutting hyperplane directions. For supernode transformations of algorithms with perfectly nested loops and uniform dependencies, we prove the optimality of a constant linear schedule vector and give a necessary and sufficient condition for optimal relative side lengths. We also prove that the total running time is minimized by a cutting hyperplane direction matrix from a particular subset of all valid directions, and we discuss the cases in which this subset is unique. The results are derived in continuous space and should be considered approximate. Our model does not include cache effects and assumes an unbounded number of available processors, a communication cost approximated by a constant, uniform dependencies, and loop bounds known at compile time. A comprehensive example applies the results to the Jacobi algorithm.
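
To make the notion of cutting hyperplane directions concrete, the following is a minimal sketch, not the paper's method. It assumes the convention common in the tiling literature: a matrix H whose rows are scaled hyperplane normals maps an iteration point j to the supernode index floor(Hj), and a tiling is treated as valid when every row of H has a nonnegative dot product with every dependence vector. The function names, the tile size of 4, and the three dependence vectors (a three-point stencil in time and space) are hypothetical choices made for illustration; they are not taken from the paper's Jacobi example.

```python
from fractions import Fraction as F
from math import floor


def tile_index(H, j):
    """Index of the supernode containing iteration point j, computed as floor(H j)."""
    return tuple(floor(sum(h * x for h, x in zip(row, j))) for row in H)


def is_valid_tiling(H, deps):
    """Assumed legality test: each row of H dotted with each dependence is >= 0."""
    return all(
        sum(h * d for h, d in zip(row, dep)) >= 0
        for row in H
        for dep in deps
    )


# Hypothetical uniform dependencies of a three-point stencil, as (time, space).
deps = [(1, -1), (1, 0), (1, 1)]

# Rectangular supernodes of side 4: the space-direction cut crosses (1, -1) backwards.
H_rect = [[F(1, 4), F(0)], [F(0), F(1, 4)]]

# Skewed supernodes whose cutting hyperplanes have normals (1, 1) and (1, -1).
H_skewed = [[F(1, 4), F(1, 4)], [F(1, 4), F(-1, 4)]]

print(is_valid_tiling(H_rect, deps))
print(is_valid_tiling(H_skewed, deps))
print(tile_index(H_skewed, (5, 2)))
```

Exact rational arithmetic is used so that points lying on a supernode boundary are assigned consistently. Under the stated assumptions, the rectangular cut fails the legality test while the skewed cut passes, which illustrates why the choice of hyperplane directions is a parameter in its own right, separate from supernode size and relative side lengths.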