Cache performance of operating system and multiprogramming workloads
ACM Transactions on Computer Systems (TOCS)
Circuit implementation of a 300-MHz 64-bit second-generation CMOS Alpha CPU
Digital Technical Journal - Special 10th anniversary issue
Internal organization of the Alpha 21164, a 300-MHz 64-bit quad-issue CMOS RISC microprocessor
Digital Technical Journal - Special 10th anniversary issue
Exploiting hardware performance counters with flow and context sensitive profiling
Proceedings of the ACM SIGPLAN 1997 conference on Programming language design and implementation
Analytical energy dissipation models for low-power caches
ISLPED '97 Proceedings of the 1997 international symposium on Low power electronics and design
Continuous profiling: where have all the cycles gone?
Proceedings of the sixteenth ACM symposium on Operating systems principles
System support for automatic profiling and optimization
Proceedings of the sixteenth ACM symposium on Operating systems principles
The filter cache: an energy efficient memory structure
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
ProfileMe: hardware support for instruction-level profiling on out-of-order processors
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
A 160-MHz, 32-b, 0.5-W CMOS RISC microprocessor
Digital Technical Journal
Dynamic IPC/clock rate optimization
Proceedings of the 25th annual international symposium on Computer architecture
ISLPED '98 Proceedings of the 1998 international symposium on Low power electronics and design
IEEE Micro
The Alpha 21264 Microprocessor
IEEE Micro
The Alpha 21264 Microprocessor Architecture
ICCD '98 Proceedings of the International Conference on Computer Design
Reconfigurable caches and their application to media processing
Proceedings of the 27th annual international symposium on Computer architecture
Gated-Vdd: a circuit technique to reduce leakage in deep-submicron cache memories
ISLPED '00 Proceedings of the 2000 international symposium on Low power electronics and design
A static power model for architects
Proceedings of the 33rd annual ACM/IEEE international symposium on Microarchitecture
Proceedings of the 33rd annual ACM/IEEE international symposium on Microarchitecture
Frequent value compression in data caches
Proceedings of the 33rd annual ACM/IEEE international symposium on Microarchitecture
Towards effective embedded processors in codesigns: customizable partitioned caches
Proceedings of the ninth international symposium on Hardware/software codesign
Power and energy reduction via pipeline balancing
ISCA '01 Proceedings of the 28th annual international symposium on Computer architecture
Variability in the execution of multimedia applications and implications for architecture
ISCA '01 Proceedings of the 28th annual international symposium on Computer architecture
Power-aware partitioned cache architectures
ISLPED '01 Proceedings of the 2001 international symposium on Low power electronics and design
Morphable Cache Architectures: Potential Benefits
OM '01 Proceedings of the 2001 ACM SIGPLAN workshop on Optimization of middleware and distributed systems
Energy-efficient instruction cache using page-based placement
CASES '01 Proceedings of the 2001 international conference on Compilers, architecture, and synthesis for embedded systems
Let caches decay: reducing leakage energy via exploitation of cache generational behavior
ACM Transactions on Computer Systems (TOCS)
Hardware and Software Techniques for Controlling DRAM Power Modes
IEEE Transactions on Computers
Compiler-directed cache polymorphism
Proceedings of the joint conference on Languages, compilers and tools for embedded systems: software and compilers for embedded systems
Managing multi-configuration hardware via dynamic working set analysis
ISCA '02 Proceedings of the 29th annual international symposium on Computer architecture
Reducing set-associative cache energy via way-prediction and selective direct-mapping
Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture
Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture
Saving energy with architectural and frequency adaptations for multimedia applications
Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture
Fine-grain CAM-tag cache resizing using miss tags
Proceedings of the 2002 international symposium on Low power electronics and design
A case for dynamic pipeline scaling
CASES '02 Proceedings of the 2002 international conference on Compilers, architecture, and synthesis for embedded systems
Increasing power efficiency of multi-core network processors through data filtering
CASES '02 Proceedings of the 2002 international conference on Compilers, architecture, and synthesis for embedded systems
On achieving balanced power consumption in software pipelined loops
CASES '02 Proceedings of the 2002 international conference on Compilers, architecture, and synthesis for embedded systems
Application-driven processor design exploration for power-performance trade-off analysis
Proceedings of the 2001 IEEE/ACM international conference on Computer-aided design
Joint local and global hardware adaptations for energy
Proceedings of the 10th international conference on Architectural support for programming languages and operating systems
An adaptive, non-uniform cache structure for wire-delay dominated on-chip caches
Proceedings of the 10th international conference on Architectural support for programming languages and operating systems
Runtime Reconfiguration Techniques for Efficient General-Purpose Computation
IEEE Design & Test
Partitioned instruction cache architecture for energy efficiency
ACM Transactions on Embedded Computing Systems (TECS)
Leakage Energy Management in Cache Hierarchies
Proceedings of the 2002 International Conference on Parallel Architectures and Compilation Techniques
Integrating Adaptive On-Chip Storage Structures for Reduced Dynamic Power
Proceedings of the 2002 International Conference on Parallel Architectures and Compilation Techniques
Integrated I-cache Way Predictor and Branch Target Buffer to Reduce Energy Consumption
ISHPC '02 Proceedings of the 4th International Symposium on High Performance Computing
A Holistic Approach to System Level Energy Optimization
PATMOS '00 Proceedings of the 10th International Workshop on Integrated Circuit Design, Power and Timing Modeling, Optimization and Simulation
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Leakage power modeling and reduction with data retention
Proceedings of the 2002 IEEE/ACM international conference on Computer-aided design
Multi-objective design space exploration using genetic algorithms
Proceedings of the tenth international symposium on Hardware/software codesign
Energy efficient frequent value data cache design
Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture
Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture
Xtream-Fit: an energy-delay efficient data memory subsystem for embedded media processing
Proceedings of the 40th annual Design Automation Conference
A DISE implementation of dynamic code decompression
Proceedings of the 2003 ACM SIGPLAN conference on Language, compiler, and tool for embedded systems
Deterministic Clock Gating for Microprocessor Power Reduction
HPCA '03 Proceedings of the 9th International Symposium on High-Performance Computer Architecture
Positional adaptation of processors: application to energy reduction
Proceedings of the 30th annual international symposium on Computer architecture
Adaptive mode control: A static-power-efficient cache design
ACM Transactions on Embedded Computing Systems (TECS)
Reducing energy and delay using efficient victim caches
Proceedings of the 2003 international symposium on Low power electronics and design
Energy efficient D-TLB and data cache using semantic-aware multilateral partitioning
Proceedings of the 2003 international symposium on Low power electronics and design
Single-ISA Heterogeneous Multi-Core Architectures: The Potential for Processor Power Reduction
Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
Near-Optimal Precharging in High-Performance Nanoscale CMOS Caches
Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
Design and analysis of low-power cache using two-level filter scheme
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Power-Aware Branch Prediction: Characterization and Design
IEEE Transactions on Computers
Dynamic Partitioning of Shared Cache Memory
The Journal of Supercomputing
An energy efficient cache memory architecture for embedded systems
Proceedings of the 2004 ACM symposium on Applied computing
A Self-Tuning Cache Architecture for Embedded Systems
Proceedings of the conference on Design, automation and test in Europe - Volume 1
Coupling compiler-enabled and conventional memory accessing for energy efficiency
ACM Transactions on Computer Systems (TOCS)
A self-tuning cache architecture for embedded systems
ACM Transactions on Embedded Computing Systems (TECS)
A Formal Approach to Frequent Energy Adaptations for Multimedia Applications
Proceedings of the 31st annual international symposium on Computer architecture
Location cache: a low-power L2 cache system
Proceedings of the 2004 international symposium on Low power electronics and design
Profile-based adaptation for cache decay
ACM Transactions on Architecture and Code Optimization (TACO)
Effective Adaptive Computing Environment Management via Dynamic Optimization
Proceedings of the international symposium on Code generation and optimization
Increasing Register File Immunity to Transient Errors
Proceedings of the conference on Design, Automation and Test in Europe - Volume 1
The implementation and evaluation of dynamic code decompression using DISE
ACM Transactions on Embedded Computing Systems (TECS)
IATAC: a smart predictor to turn-off L2 cache lines
ACM Transactions on Architecture and Code Optimization (TACO)
Skewed caches from a low-power perspective
Proceedings of the 2nd conference on Computing frontiers
A highly configurable cache for low energy embedded systems
ACM Transactions on Embedded Computing Systems (TECS)
Energy-efficient and high-performance instruction fetch using a block-aware ISA
ISLPED '05 Proceedings of the 2005 international symposium on Low power electronics and design
Snug set-associative caches: reducing leakage power while improving performance
ISLPED '05 Proceedings of the 2005 international symposium on Low power electronics and design
Fast and fair: data-stream quality of service
Proceedings of the 2005 international conference on Compilers, architectures and synthesis for embedded systems
Hibernator: helping disk arrays sleep through the winter
Proceedings of the twentieth ACM symposium on Operating systems principles
RECAST: Boosting Tag Line Buffer Coverage in Low-Power High-Level Caches "for Free"
ICCD '05 Proceedings of the 2005 International Conference on Computer Design
Thermal Management of On-Chip Caches Through Power Density Minimization
Proceedings of the 38th annual IEEE/ACM International Symposium on Microarchitecture
Locality analysis to control dynamically way-adaptable caches
MEDEA '04 Proceedings of the 2004 workshop on MEmory performance: DEaling with Applications , systems and architecture
Dynamic Resizing of Superscalar Datapath Components for Energy Efficiency
IEEE Transactions on Computers
Compiler-directed high-level energy estimation and optimization
ACM Transactions on Embedded Computing Systems (TECS)
Analyzing data reuse for cache reconfiguration
ACM Transactions on Embedded Computing Systems (TECS)
Lazy BTB: reduce BTB energy consumption using dynamic profiling
ASP-DAC '06 Proceedings of the 2006 Asia and South Pacific Design Automation Conference
Cache size selection for performance, energy and reliability of time-constrained systems
ASP-DAC '06 Proceedings of the 2006 Asia and South Pacific Design Automation Conference
Power density minimization for highly-associative caches in embedded processors
GLSVLSI '06 Proceedings of the 16th ACM Great Lakes symposium on VLSI
Impact of virtual execution environments on processor energy consumption and hardware adaptation
Proceedings of the 2nd international conference on Virtual execution environments
Block-aware instruction set architecture
ACM Transactions on Architecture and Code Optimization (TACO)
Process variation aware cache leakage management
Proceedings of the 2006 international symposium on Low power electronics and design
Proceedings of the 20th annual international conference on Supercomputing
Yield-Aware Cache Architectures
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
ALP: Efficient support for all levels of parallelism for complex media applications
ACM Transactions on Architecture and Code Optimization (TACO)
ACM Transactions on Architecture and Code Optimization (TACO)
The effect of temperature on cache size tuning for low energy embedded systems
Proceedings of the 17th ACM Great Lakes symposium on VLSI
Page mapping for heterogeneously partitioned caches: Complexity and heuristics
Journal of Embedded Computing - Cache exploitation in embedded systems
A cache design for high performance embedded systems
Journal of Embedded Computing - Cache exploitation in embedded systems
Unified microprocessor core storage
Proceedings of the 4th international conference on Computing frontiers
Exploiting program phase behavior for energy reduction on multi-configuration processors
Journal of Systems Architecture: the EUROMICRO Journal
Compiler-managed partitioned data caches for low power
Proceedings of the 2007 ACM SIGPLAN/SIGBED conference on Languages, compilers, and tools for embedded systems
Managing energy-performance tradeoffs for multithreaded applications on multiprocessor architectures
Proceedings of the 2007 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
A one-shot configurable-cache tuner for improved energy and performance
Proceedings of the conference on Design, automation and test in Europe
DRIM: a low power dynamically reconfigurable instruction memory hierarchy for embedded systems
Proceedings of the conference on Design, automation and test in Europe
Cross-component energy management: Joint adaptation of processor and memory
ACM Transactions on Architecture and Code Optimization (TACO)
A self-tuning configurable cache
Proceedings of the 44th annual Design Automation Conference
CASES '07 Proceedings of the 2007 international conference on Compilers, architecture, and synthesis for embedded systems
A low power front-end for embedded processors using a block-aware instruction set
CASES '07 Proceedings of the 2007 international conference on Compilers, architecture, and synthesis for embedded systems
Dynamic tag reduction for low-power caches in embedded systems with virtual memory
International Journal of Parallel Programming
Thermal management of on-chip caches through power density minimization
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Reducing leakage in power-saving capable caches for embedded systems by using a filter cache
MEDEA '07 Proceedings of the 2007 workshop on MEmory performance: DEaling with Applications, systems and architecture
A power-aware shared cache mechanism based on locality assessment of memory reference for CMPs
MEDEA '07 Proceedings of the 2007 workshop on MEmory performance: DEaling with Applications, systems and architecture
Improving power efficiency of D-NUCA caches
ACM SIGARCH Computer Architecture News
Block remap with turnoff: a variation-tolerant cache design technique
Proceedings of the 2008 Asia and South Pacific Design Automation Conference
Reconfiguration management in the context of RTOS-based HW/SW embedded systems
EURASIP Journal on Embedded Systems - Operating System Support for Embedded Real-Time Applications
Phase-based cache reconfiguration for a highly-configurable two-level cache hierarchy
Proceedings of the 18th ACM Great Lakes symposium on VLSI
A dynamically reconfigurable cache for multithreaded processors
Journal of Embedded Computing - Issues in embedded single-chip multicore architectures
Sampling-based program locality approximation
Proceedings of the 7th international symposium on Memory management
Miss reduction in embedded processors through dynamic, power-friendly cache design
Proceedings of the 45th annual Design Automation Conference
Row/column redundancy to reduce SRAM leakage in presence of random within-die delay variation
Proceedings of the 13th international symposium on Low power electronics and design
Word-interleaved cache: an energy efficient data cache architecture
Proceedings of the 13th international symposium on Low power electronics and design
Multi-optimization power management for chip multiprocessors
Proceedings of the 17th international conference on Parallel architectures and compilation techniques
Thrifty BTB: A comprehensive solution for dynamic power reduction in branch target buffers
Microprocessors & Microsystems
Specializing Cache Structures for High Performance and Energy Conservation in Embedded Systems
Transactions on High-Performance Embedded Architectures and Compilers I
RapidMRC: approximating L2 miss rate curves on commodity systems for online optimizations
Proceedings of the 14th international conference on Architectural support for programming languages and operating systems
Evaluating the effects of cache redundancy on profit
Proceedings of the 41st annual IEEE/ACM International Symposium on Microarchitecture
Reconfigurable energy efficient near threshold cache architectures
Proceedings of the 41st annual IEEE/ACM International Symposium on Microarchitecture
Data Cache Techniques to Save Power and Deliver High Performance in Embedded Systems
Transactions on High-Performance Embedded Architectures and Compilers II
Improving SMT performance: an application of genetic algorithms to configure resizable caches
Proceedings of the 11th Annual Conference Companion on Genetic and Evolutionary Computation Conference: Late Breaking Papers
Exploration of 3D stacked L2 cache design for high performance and efficient thermal control
Proceedings of the 14th ACM/IEEE international symposium on Low power electronics and design
Low-power inter-core communication through cache partitioning in embedded multiprocessors
Proceedings of the 22nd Annual Symposium on Integrated Circuits and System Design: Chip on the Dunes
The Design and Evaluation of a Selective Way Based Trace Cache
APPT '09 Proceedings of the 8th International Symposium on Advanced Parallel Processing Technologies
A DVS-based pipelined reconfigurable instruction memory
Proceedings of the 46th Annual Design Automation Conference
Efficient line buffer instruction cache scheme with prefetch
Proceedings of the 2nd International Conference on Interaction Sciences: Information Technology, Culture and Human
Reducing peak power with a table-driven adaptive processor core
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture
Cache partitioning for energy-efficient and interference-free embedded multitasking
ACM Transactions on Embedded Computing Systems (TECS)
Exploiting memory soft redundancy for joint improvement of error tolerance and access efficiency
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Dynamic capacity-speed tradeoffs in SMT processor caches
HiPEAC'07 Proceedings of the 2nd international conference on High performance embedded architectures and compilers
Early-stage definition of LPX: a low power issue-execute processor
PACS'02 Proceedings of the 2nd international conference on Power-aware computer systems
PACS'02 Proceedings of the 2nd international conference on Power-aware computer systems
EUC'07 Proceedings of the 2007 international conference on Embedded and ubiquitous computing
Reducing energy in instruction caches by using multiple line buffers with prediction
ISHPC'05/ALPS'06 Proceedings of the 6th international symposium on high-performance computing and 1st international conference on Advanced low power systems
Off-chip memory bandwidth minimization through cache partitioning for multi-core platforms
Proceedings of the 47th Design Automation Conference
International Journal of High Performance Systems Architecture
ACM Transactions on Design Automation of Electronic Systems (TODAES)
A reconfigurable cache memory with heterogeneous banks
Proceedings of the Conference on Design, Automation and Test in Europe
Dynamic, non-linear cache architecture for power-sensitive mobile processors
CODES/ISSS '10 Proceedings of the eighth IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis
Power and performance aware reconfigurable cache for CMPs
Proceedings of the Second International Forum on Next-Generation Multicore/Manycore Technologies
Online cache modeling for commodity multicore processors
ACM SIGOPS Operating Systems Review
SRAM leakage reduction by row/column redundancy under random within-die delay variation
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
On the interplay of loop caching, code compression, and cache configuration
Proceedings of the 16th Asia and South Pacific Design Automation Conference
ACM Transactions on Embedded Computing Systems (TECS)
DCG: deterministic clock-gating for low-power microprocessor design
IEEE Transactions on Very Large Scale Integration (VLSI) Systems - Special section on the 2002 international symposium on low-power electronics and design (ISLPED)
A majority-based control scheme for way-adaptable caches
Facing the multicore-challenge
Power-aware dynamic cache partitioning for CMPs
Transactions on high-performance embedded architectures and compilers III
Transactions on high-performance embedded architectures and compilers III
A majority-based control scheme for way-adaptable caches
Facing the multicore-challenge
DEFCAM: A design and evaluation framework for defect-tolerant cache memories
ACM Transactions on Architecture and Code Optimization (TACO)
HC-Sim: a fast and exact l1 cache simulator with scratchpad memory co-simulation support
CODES+ISSS '11 Proceedings of the seventh IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis
A phase adaptive cache hierarchy for SMT processors
Microprocessors & Microsystems
The gradient-based cache partitioning algorithm
ACM Transactions on Architecture and Code Optimization (TACO) - HIPEAC Papers
Energy efficient united l2 cache design with instruction/data filter scheme
APPT'05 Proceedings of the 6th international conference on Advanced Parallel Processing Technologies
Selective word reading for high performance and low power processor
Proceedings of the 2011 ACM Symposium on Research in Applied Computation
HiPEAC'05 Proceedings of the First international conference on High Performance Embedded Architectures and Compilers
First-level instruction cache design for reducing dynamic energy consumption
SAMOS'05 Proceedings of the 5th international conference on Embedded Computer Systems: architectures, Modeling, and Simulation
Offline phase analysis and optimization for multi-configuration processors
SAMOS'05 Proceedings of the 5th international conference on Embedded Computer Systems: architectures, Modeling, and Simulation
Exploring the potential of architecture-level power optimizations
PACS'03 Proceedings of the Third international conference on Power - Aware Computer Systems
Performance/Thermal-Aware Design of 3D-Stacked L2 Caches for CMPs
ACM Transactions on Design Automation of Electronic Systems (TODAES)
An innovative instruction cache for embedded processors
ACSAC'05 Proceedings of the 10th Asia-Pacific conference on Advances in Computer Systems Architecture
Recent thermal management techniques for microprocessors
ACM Computing Surveys (CSUR)
Link-time optimization for power efficiency in a tagless instruction cache
CGO '11 Proceedings of the 9th Annual IEEE/ACM International Symposium on Code Generation and Optimization
ADAM: an efficient data management mechanism for hybrid high and ultra-low voltage operation caches
Proceedings of the great lakes symposium on VLSI
Survey of scheduling techniques for addressing shared resources in multicore processors
ACM Computing Surveys (CSUR)
Low power cache architectures with hybrid approach of filtering unnecessary way accesses
Proceedings of the 2013 International Workshop on Programming Models and Applications for Multicores and Manycores
Towards a performance- and energy-efficient data filter cache
Proceedings of the 10th Workshop on Optimizations for DSP and Embedded Systems
Shrinking l1 instruction caches to improve energy: delay in SMT embedded processors
ARCS'13 Proceedings of the 26th international conference on Architecture of Computing Systems
Unifying Primary Cache, Scratch, and Register File Memories in a Throughput Processor
MICRO-45 Proceedings of the 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture
Low-Latency Mechanisms for Near-Threshold Operation of Private Caches in Shared Memory Multicores
MICROW '12 Proceedings of the 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture Workshops
Reuse-based online models for caches
Proceedings of the ACM SIGMETRICS/international conference on Measurement and modeling of computer systems
A survey on cache tuning from a power/energy perspective
ACM Computing Surveys (CSUR)
Efficient cache architectures for reliable hybrid voltage operation using EDC codes
Proceedings of the Conference on Design, Automation and Test in Europe
Run-time reconfiguration of expandable cache for embedded systems
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Dynamically reconfigurable hybrid cache: an energy-efficient last-level cache design
DATE '12 Proceedings of the Conference on Design, Automation and Test in Europe
Toward application-specific memory reconfiguration for energy efficiency
E2SC '13 Proceedings of the 1st International Workshop on Energy Efficient Supercomputing
Application-aware adaptive cache architecture for power-sensitive mobile processors
ACM Transactions on Embedded Computing Systems (TECS)
Virtually split cache: An efficient mechanism to distribute instructions and data
ACM Transactions on Architecture and Code Optimization (TACO)
Thread-criticality aware dynamic cache reconfiguration in multi-core system
Proceedings of the International Conference on Computer-Aided Design
Hi-index | 0.01 |
Increasing levels of microprocessor power dissipation call for new approaches at the architectural level that save energy by better matching of on-chip resources to application requirements. Selective cache ways provides the ability to disable a subset of the ways in a set associative cache during periods of modest cache activity, while the full cache may remain operational for more cache-intensive periods. Because this approach leverages the subarray partitioning that is already present for performance reasons, only minor changes to a conventional cache are required, and therefore, full-speed cache operation can be maintained. Furthermore, the tradeoff between performance and energy is flexible, and can be dynamically tailored to meet changing application and machine environmental conditions. We show that trading off a small performance degradation for energy savings can produce a significant reduction in cache energy dissipation using this approach.