The Journal of Supercomputing
Energy saving is becoming one of the major design issues in processor architectures with multiple functional units (FUs). Nested loops are usually the most critical part of multimedia and high-performance DSP systems. For such loops, power saving must be traded off against performance requirements such as timing constraints and code size. This paper studies how to minimize total energy while satisfying the performance requirements of applications with multidimensional nested loops. We propose an algorithm, energy minimization with loop fusion and FU schedule (EMLFS). It first uses retiming and partitioning to fuse nested loops, then applies novel FU scheduling algorithms to maximize energy saving without sacrificing performance. Experimental results show that EMLFS yields a significant average improvement in energy saving.
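To illustrate the general idea of using retiming to enable loop fusion, here is a minimal one-dimensional sketch. It is not the EMLFS algorithm, which targets multidimensional loops and also schedules FUs. The function names and the toy computation are hypothetical. The second loop reads `b[i + 1]`, so fusing the two loops directly would read a value that has not been computed yet. Shifting (retiming) the second statement by one iteration removes that fusion-preventing dependence.

```python
def separate_loops(a):
    """Reference version: two loops executed one after the other."""
    n = len(a)
    b = [0] * n
    for i in range(n):
        b[i] = 2 * a[i]
    c = [0] * max(n - 1, 0)
    for i in range(n - 1):
        # Reads b[i + 1], which the first loop produces one iteration later.
        c[i] = b[i] + b[i + 1]
    return c


def fused_with_retiming(a):
    """Fused version: the second statement is delayed by one iteration."""
    n = len(a)
    b = [0] * n
    c = [0] * max(n - 1, 0)
    for i in range(n):
        b[i] = 2 * a[i]
        if i >= 1:
            # Retimed statement: computes c[i - 1] once b[i] is available.
            c[i - 1] = b[i - 1] + b[i]
    return c
```

Both functions return the same list for any input, but the fused version traverses the iteration space once, which is the kind of restructuring a later scheduling step can exploit.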