Fibonacci heaps and their uses in improved network optimization algorithms
Journal of the ACM (JACM)
Code generation schema for modulo scheduled loops
MICRO 25 Proceedings of the 25th annual international symposium on Microarchitecture
Scheduling and behavioral transformation for parallel systems
Scheduling and behavioral transformation for parallel systems
Bus-invert coding for low-power I/O
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Register allocation and binding for low power
DAC '95 Proceedings of the 32nd annual ACM/IEEE Design Automation Conference
Instruction level power analysis and optimization of software
Journal of VLSI Signal Processing Systems - Special issue on technologies for wireless computing
Power analysis and minimization techniques for embedded DSP software
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Low-power memory mapping through reducing address bus activity
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Reducing bus transition activity by limited weight coding with codeword slimming
GLSVLSI '00 Proceedings of the 10th Great Lakes symposium on VLSI
The design and use of simplepower: a cycle-accurate energy estimation tool
Proceedings of the 37th Annual Design Automation Conference
Power-aware modulo scheduling for high-performance VLIW processors
ISLPED '01 Proceedings of the 2001 international symposium on Low power electronics and design
Synthesis and Optimization of Digital Circuits
Synthesis and Optimization of Digital Circuits
Exploiting VLIW schedule slacks for dynamic and leakage energy reduction
Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture
On achieving balanced power consumption in software pipelined loops
CASES '02 Proceedings of the 2002 international conference on Compilers, architecture, and synthesis for embedded systems
Saving Power in the Control Path of Embedded Processors
IEEE Design & Test
Compiler optimization on VLIW instruction scheduling for low power
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Adapting instruction level parallelism for optimizing leakage in VLIW architectures
Proceedings of the 2003 ACM SIGPLAN conference on Language, compiler, and tool for embedded systems
Code size reduction technique and implementation for software-pipelined DSP applications
ACM Transactions on Embedded Computing Systems (TECS)
Adaptive low-power address encoding techniques using self-organizing lists
IEEE Transactions on Very Large Scale Integration (VLSI) Systems - Special section on low power
Instruction Scheduling for Low Power
Journal of VLSI Signal Processing Systems
Rotation scheduling: a loop pipelining algorithm
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Optimizing near-ML MIMO detector for SDR baseband on parallel programmable architectures
Proceedings of the conference on Design, automation and test in Europe
Proceedings of the conference on Design, automation and test in Europe
Generic multiphase software pipelined partial FFT on instruction level parallel architectures
IEEE Transactions on Signal Processing
Optimizing scheduling and intercluster connection for application-specific DSP processors
IEEE Transactions on Signal Processing
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Loop fusion and reordering for register file optimization on stream processors
Journal of Systems and Software
Instruction Cache Locking for Embedded Systems using Probability Profile
Journal of Signal Processing Systems
Hi-index | 0.01 |
In embedded systems, high-performance DSP needs to be performed not only with high-data throughput but also with low-power consumption. This article develops an instruction-level loop-scheduling technique to reduce both execution time and bus-switching activities for applications with loops on VLIW architectures. We propose an algorithm, SAMLS (Switching-Activity Minimization Loop Scheduling), to minimize both schedule length and switching activities for applications with loops. In the algorithm, we obtain the best schedule from the ones that are generated from an initial schedule by repeatedly rescheduling the nodes with schedule length and switching activities minimization based on rotation scheduling and bipartite matching. The experimental results show that our algorithm can reduce both schedule length and bus-switching activities. Compared with the work of Lee et al. [2003], SAMLS shows an average 11.5% reduction in schedule length and an average 19.4% reduction in bus-switching activities.