Multidimensional (MD) systems are widely used to model scientific applications such as image processing, geophysical signal processing, and fluid dynamics. Such systems usually contain repetitive groups of operations represented by nested loops. Reducing the computation time of these loops requires optimizing them under processing resource constraints. Most existing static scheduling mechanisms used in the high-level synthesis of very large scale integration (VLSI) architectures do not exploit the parallelism inherent in the multidimensional nature of the problem. This paper explores the basic properties of MD loop pipelining and presents two novel techniques, multidimensional rotation scheduling and push-up scheduling, that can achieve the shortest possible schedule length. Both techniques transform a multidimensional data flow graph representing the problem while assigning the loop operations to a schedule table. Multidimensional rotation scheduling is an iterative "heuristic" method that depends on user input, whereas the push-up scheduling algorithm computes the new schedule in polynomial time. A series of practical experiments demonstrates the optimal resulting schedule length and the efficiency of the algorithms.
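To make the idea of transforming a multidimensional data flow graph while filling a schedule table more concrete, the following is a minimal Python sketch. It is not the paper's rotation or push-up algorithm. It only illustrates, on a hypothetical three-node graph, how moving delay vectors across nodes (retiming) can shorten a resource-constrained schedule. Everything in it is an assumption made for illustration: the graph, the function names, unit-time operations, the retiming convention d_r(u->v) = d(u->v) + r(u) - r(v), and the use of a schedule vector s with s·d > 0 as a sufficient check that the retimed loop can still be executed in a valid order.

```python
from collections import defaultdict

ZERO = (0, 0)

def retime(edges, r):
    """Return edges with retimed 2-D delays.

    Assumed convention: d_r(u->v) = d(u->v) + r(u) - r(v).
    """
    out = []
    for u, v, d in edges:
        ru, rv = r.get(u, ZERO), r.get(v, ZERO)
        out.append((u, v, tuple(x + a - b for x, a, b in zip(d, ru, rv))))
    return out

def respects_schedule_vector(edges, s):
    """Sufficient check (an assumption of this sketch): every non-zero
    delay vector has a strictly positive dot product with s."""
    return all(
        sum(a * b for a, b in zip(d, s)) > 0
        for _, _, d in edges
        if any(d)
    )

def schedule_length(nodes, edges, units):
    """Greedy list scheduling of unit-time operations on `units` identical
    functional units. Only zero-delay edges constrain a single iteration."""
    if units < 1:
        raise ValueError("need at least one functional unit")
    preds = defaultdict(set)
    for u, v, d in edges:
        if not any(d):
            preds[v].add(u)
    step, t = {}, 0
    while len(step) < len(nodes):
        # A node is ready once all zero-delay predecessors ran in earlier steps.
        ready = [n for n in nodes
                 if n not in step and all(p in step for p in preds[n])]
        if not ready:
            raise ValueError("zero-delay cycle: graph is not schedulable")
        for n in ready[:units]:
            step[n] = t
        t += 1
    return t

# Hypothetical MD data flow graph: A -> B -> C within one iteration,
# and C feeds A across iterations with delay vector (1, 1).
nodes = ["A", "B", "C"]
edges = [("A", "B", (0, 0)), ("B", "C", (0, 0)), ("C", "A", (1, 1))]

partly = retime(edges, {"A": (0, 1)})
fully = retime(edges, {"A": (0, 2), "B": (0, 1)})

print(schedule_length(nodes, edges, units=3))   # 3
print(schedule_length(nodes, partly, units=3))  # 2
print(schedule_length(nodes, fully, units=3))   # 1
print(respects_schedule_vector(fully, (2, 1)))  # True
```

In this toy graph the fully retimed version has no zero-delay edges left, so all three operations fit into one step of the schedule table when three units are available. One of its delay vectors becomes (1, -1), which is why the sketch checks the delays against a schedule vector rather than requiring every component to be non-negative.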