With the emergence of a plethora of embedded and portable applications, energy dissipation has joined throughput, VLSI layout area, and accuracy/precision as a major design constraint. Designers must therefore both estimate and optimize the energy consumption of circuits, architectures, and software. Most research on energy optimization and estimation has focused on single components of the system and has not looked across the interacting spectrum of hardware and software. The novelty of our energy estimation framework, SimplePower, is that it evaluates energy by considering the system as a whole rather than as a sum of parts, and that it concurrently supports both compiler and architectural experimentation. We present the design and use of the SimplePower framework, which includes a transition-sensitive, cycle-accurate datapath energy model that interfaces with analytical and transition-sensitive energy models for the memory, clock, and bus subsystems, respectively. Such an architectural-level energy estimation framework is invaluable for making good energy-conscious decisions early in the design cycle. We analyzed the energy consumption of 10 codes from the multidimensional array domain, a domain that is important for embedded video and signal processing systems. Our study shows that the pipeline registers and the register file are the datapath energy hotspots, consuming 58-70 percent of overall datapath energy, and that the clocking of the on-chip memory structures is the major source of the on-chip clock network's energy consumption. Further, we find that the off-chip main memory is the overall energy bottleneck of the entire system. However, applying high-level compiler optimizations reduces the main memory energy significantly, so that the contributions of the data cache, on-chip clock network, instruction cache, and datapath become more important.
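As a rough illustration of the difference between a transition-sensitive and an analytical energy model, the following Python sketch estimates bus energy two ways: by counting the bit lines that toggle between consecutive values, and by charging a fixed cost per access. The function names, the per-toggle and per-access energy constants, and the bus width are all assumptions made for this example; they are not taken from SimplePower.

```python
def count_toggles(prev: int, curr: int, width: int = 32) -> int:
    """Number of bit lines that change state between two consecutive values."""
    mask = (1 << width) - 1
    return bin((prev ^ curr) & mask).count("1")


def transition_sensitive_energy(values, energy_per_toggle=1.0, width=32, initial=0):
    """Sum a hypothetical per-toggle energy over a sequence of bus values.

    The estimate depends on the data itself: identical consecutive values
    cost nothing, while values that flip every line cost the most.
    """
    total = 0.0
    prev = initial
    for value in values:
        total += energy_per_toggle * count_toggles(prev, value, width)
        prev = value
    return total


def analytical_energy(num_accesses, energy_per_access=1.0):
    """Data-independent estimate: a fixed (assumed) cost for every access."""
    return num_accesses * energy_per_access


if __name__ == "__main__":
    trace = [0b1111, 0b0000, 0b0101]
    print(transition_sensitive_energy(trace, energy_per_toggle=2.0, width=4))
    print(analytical_energy(len(trace), energy_per_access=5.0))
```

In this sketch the transition-sensitive estimate needs the actual value trace from a cycle-by-cycle simulation, whereas the analytical estimate needs only an access count.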
We found that the improved locality of the optimized codes not only reduces accesses to the main memory but also allows them to exploit more energy-efficient cache architectures much better than unoptimized codes. With the most-recently-used way-prediction cache scheme, optimized codes saved 21 percent more energy than unoptimized codes from the multidimensional array domain. We also observed that emerging technologies such as embedded DRAM, coupled with a combination of energy-efficient circuit, architectural, and compiler optimizations, can potentially shift the energy hotspot. Thus, we have demonstrated that early estimates from the SimplePower energy estimation framework can help identify the system energy hotspots and enable architects and compiler designers to focus their efforts on these areas.
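The most-recently-used way-prediction idea mentioned above can be sketched as a small set-associative cache model that probes the predicted way first and probes the remaining ways only on a misprediction. This is a minimal sketch under stated assumptions: the class name, the LRU replacement policy, and the use of a way-probe count as a stand-in for cache access energy are all hypothetical choices for illustration, not the scheme evaluated in the study.

```python
class WayPredictingCache:
    """Set-associative cache model with MRU way prediction (illustrative only)."""

    def __init__(self, num_sets, num_ways, line_size):
        self.num_sets = num_sets
        self.num_ways = num_ways
        self.line_size = line_size
        self.tags = [[None] * num_ways for _ in range(num_sets)]
        self.mru = [0] * num_sets  # predicted way per set
        # Per-set replacement order, least recently used way first (assumed LRU).
        self.lru_order = [list(range(num_ways)) for _ in range(num_sets)]
        self.way_probes = 0  # proxy for energy: number of ways read
        self.hits = 0
        self.misses = 0

    def access(self, address):
        block = address // self.line_size
        index = block % self.num_sets
        tag = block // self.num_sets
        ways = self.tags[index]

        predicted = self.mru[index]
        self.way_probes += 1  # first probe touches only the predicted way
        if ways[predicted] == tag:
            self.hits += 1
            way = predicted
        else:
            self.way_probes += self.num_ways - 1  # probe the remaining ways
            if tag in ways:
                self.hits += 1
                way = ways.index(tag)
            else:
                self.misses += 1
                way = self.lru_order[index][0]  # evict the LRU way
                ways[way] = tag

        # Mark the accessed way as most recently used.
        self.lru_order[index].remove(way)
        self.lru_order[index].append(way)
        self.mru[index] = way
        return way


if __name__ == "__main__":
    cache = WayPredictingCache(num_sets=1, num_ways=2, line_size=4)
    for addr in [0, 0, 8, 0]:
        cache.access(addr)
    print(cache.hits, cache.misses, cache.way_probes)
```

In this model, an access stream with good locality tends to hit in the predicted way and pays for a single way probe, while a conventional lookup would read all ways on every access.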