Loop scheduling with timing and switching-activity minimization for VLIW DSP

Authors:
Zili Shao;Bin Xiao;Chun Xue;Qingfeng Zhuge;Edwin H.-M. Sha
Affiliations:
Hong Kong Polytechnic University, Kowloon, Hong Kong;Hong Kong Polytechnic University, Kowloon, Hong Kong;University of Texas at Dallas, Richardson, TX;University of Texas at Dallas, Richardson, TX;University of Texas at Dallas, Richardson, TX
Venue:
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Year:
2006

References (21):
Fibonacci heaps and their uses in improved network optimization algorithms. Journal of the ACM (JACM)
Code generation schema for modulo scheduled loops. MICRO 25 Proceedings of the 25th annual international symposium on Microarchitecture
Scheduling and behavioral transformation for parallel systems
Bus-invert coding for low-power I/O. IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Register allocation and binding for low power. DAC '95 Proceedings of the 32nd annual ACM/IEEE Design Automation Conference
Instruction level power analysis and optimization of software. Journal of VLSI Signal Processing Systems - Special issue on technologies for wireless computing
Power analysis and minimization techniques for embedded DSP software. IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Low-power memory mapping through reducing address bus activity. IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Reducing bus transition activity by limited weight coding with codeword slimming. GLSVLSI '00 Proceedings of the 10th Great Lakes symposium on VLSI
The design and use of simplepower: a cycle-accurate energy estimation tool. Proceedings of the 37th Annual Design Automation Conference
Power-aware modulo scheduling for high-performance VLIW processors. ISLPED '01 Proceedings of the 2001 international symposium on Low power electronics and design
Synthesis and Optimization of Digital Circuits
Exploiting VLIW schedule slacks for dynamic and leakage energy reduction. Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture
On achieving balanced power consumption in software pipelined loops. CASES '02 Proceedings of the 2002 international conference on Compilers, architecture, and synthesis for embedded systems
Saving Power in the Control Path of Embedded Processors. IEEE Design & Test
Compiler optimization on VLIW instruction scheduling for low power. ACM Transactions on Design Automation of Electronic Systems (TODAES)
Adapting instruction level parallelism for optimizing leakage in VLIW architectures. Proceedings of the 2003 ACM SIGPLAN conference on Language, compiler, and tool for embedded systems
Code size reduction technique and implementation for software-pipelined DSP applications. ACM Transactions on Embedded Computing Systems (TECS)
Adaptive low-power address encoding techniques using self-organizing lists. IEEE Transactions on Very Large Scale Integration (VLSI) Systems - Special section on low power
Instruction Scheduling for Low Power. Journal of VLSI Signal Processing Systems
Rotation scheduling: a loop pipelining algorithm. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems

Cited by (7):
Optimizing near-ML MIMO detector for SDR baseband on parallel programmable architectures. Proceedings of the conference on Design, automation and test in Europe
Generic multi-phase software-pipelined Partial-FFT on instruction-level-parallel architectures and SDR baseband applications. Proceedings of the conference on Design, automation and test in Europe
Generic multiphase software pipelined partial FFT on instruction level parallel architectures. IEEE Transactions on Signal Processing
Optimizing scheduling and intercluster connection for application-specific DSP processors. IEEE Transactions on Signal Processing
Unification of scheduling, binding, and retiming to reduce power consumption under timings and resources constraints. IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Loop fusion and reordering for register file optimization on stream processors. Journal of Systems and Software
Instruction Cache Locking for Embedded Systems using Probability Profile. Journal of Signal Processing Systems

Abstract

In embedded systems, high-performance DSP must deliver high data throughput at low power consumption. This article develops an instruction-level loop-scheduling technique that reduces both execution time and bus-switching activity for loop-based applications on VLIW architectures. We propose an algorithm, SAMLS (Switching-Activity Minimization Loop Scheduling), that minimizes both schedule length and switching activity. Starting from an initial schedule, the algorithm repeatedly reschedules nodes using rotation scheduling and bipartite matching, minimizing schedule length and switching activity at each step, and selects the best of the schedules generated. Experimental results show that the algorithm reduces both schedule length and bus-switching activity: compared with the work of Lee et al. [2003], SAMLS achieves an average 11.5% reduction in schedule length and an average 19.4% reduction in bus-switching activity.
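
The two quantities the abstract talks about can be made concrete with a small sketch. This is not the SAMLS algorithm and is not taken from the article. It assumes that switching activity is measured as the Hamming distance between instruction words issued in consecutive control steps on the same functional unit, and it uses a brute-force search as a stand-in for a real bipartite matching algorithm. Rotation scheduling is not shown. All function names are hypothetical.

```python
from itertools import permutations


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two instruction words differ."""
    return bin(a ^ b).count("1")


def switching_activity(schedule):
    """Total bit toggles over one iteration of a repeating loop schedule.

    schedule[t][u] is the instruction word issued on functional unit u at
    control step t (an assumed encoding). Because the loop body repeats,
    the last control step wraps around to the first.
    """
    steps = len(schedule)
    total = 0
    for t in range(steps):
        nxt = schedule[(t + 1) % steps]
        total += sum(hamming(a, b) for a, b in zip(schedule[t], nxt))
    return total


def best_assignment(prev_words, candidates):
    """Order the candidate words across units to minimize toggles.

    Tries every permutation of `candidates` (one word per unit, same length
    as `prev_words`) and keeps the cheapest. This is a brute-force minimum
    cost bipartite matching, only practical for a handful of units.
    """
    def cost(perm):
        return sum(hamming(a, b) for a, b in zip(prev_words, perm))

    return list(min(permutations(candidates), key=cost))


if __name__ == "__main__":
    naive = [[0b1100, 0b0011], [0b0011, 0b1100]]
    matched = [naive[0], best_assignment(naive[0], naive[1])]
    print(switching_activity(naive), switching_activity(matched))
```

In this toy case, swapping which unit receives which word in the second control step removes every bit toggle while leaving the schedule length unchanged, which is the kind of trade-off the matching step is meant to exploit.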