Scheduling and partitioning for multiple loop nests

Authors:
Zhong Wang;Qingfeng Zhuge;Edwin H.-M. Sha
Affiliations:
University of Notre Dame, Notre Dame, IN;University of Texas at Dallas, Richardson, TX;University of Texas at Dallas, Richardson, TX
Venue:
Proceedings of the 14th international symposium on Systems synthesis
Year:
2001

Abstract

This paper presents the multiple loop partition scheduling technique, which combines loop partitioning with prefetching. It exploits data locality better than loop fusion and than traditional loop partitioning, which focuses on only a single nested loop. Moreover, multiple loop partition scheduling balances computation and memory loading so that long memory latency can be hidden effectively. Experiments show that multiple loop partition scheduling achieves significant improvement over existing methods.