A simulation based study of TLB performance
ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
Cache design trade-offs for power and performance optimization: a case study
ISLPED '95 Proceedings of the 1995 international symposium on Low power design
Reducing TLB power requirements
ISLPED '97 Proceedings of the 1997 international symposium on Low power electronics and design
A 160-MHz, 32-b, 0.5-W CMOS RISC microprocessor
Digital Technical Journal
Decoupling local variable accesses in a wide-issue superscalar processor
ISCA '99 Proceedings of the 26th annual international symposium on Computer architecture
ISLPED '99 Proceedings of the 1999 international symposium on Low power electronics and design
Selective cache ways: on-demand cache resource allocation
Proceedings of the 32nd annual ACM/IEEE international symposium on Microarchitecture
Filtering Memory References to Increase Energy Efficiency
IEEE Transactions on Computers
Wattch: a framework for architectural-level power analysis and optimizations
Proceedings of the 27th annual international symposium on Computer architecture
Region-based caching: an energy-delay efficient memory architecture for embedded processors
CASES '00 Proceedings of the 2000 international conference on Compilers, architecture, and synthesis for embedded systems
L1 data cache decomposition for energy efficiency
ISLPED '01 Proceedings of the 2001 international symposium on Low power electronics and design
Proceedings of the 10th international conference on Architectural support for programming languages and operating systems
A stateless, content-directed data prefetching mechanism
Proceedings of the 10th international conference on Architectural support for programming languages and operating systems
Pointer cache assisted prefetching
Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture
Generating physical addresses directly for saving instruction TLB energy
Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture
A Low Power TLB Structure for Embedded Systems
IEEE Computer Architecture Letters
Compiler-directed code restructuring for reducing data TLB energy
Proceedings of the 2nd IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis
Optimizing instruction TLB energy using software and hardware techniques
ACM Transactions on Design Automation of Electronic Systems (TODAES)
An energy efficient TLB design methodology
ISLPED '05 Proceedings of the 2005 international symposium on Low power electronics and design
Synonymous address compaction for energy reduction in data TLB
ISLPED '05 Proceedings of the 2005 international symposium on Low power electronics and design
Entropy-based low power data TLB design
CASES '06 Proceedings of the 2006 international conference on Compilers, architecture and synthesis for embedded systems
Locality-driven architectural cache sub-banking for leakage energy reduction
ISLPED '07 Proceedings of the 2007 international symposium on Low power electronics and design
Reducing cache energy consumption by tag encoding in embedded processors
ISLPED '07 Proceedings of the 2007 international symposium on Low power electronics and design
Exploiting access semantics and program behavior to reduce snoop power in chip multiprocessors
Proceedings of the 13th international conference on Architectural support for programming languages and operating systems
Direct address translation for virtual memory in energy-efficient embedded systems
ACM Transactions on Embedded Computing Systems (TECS)
Specializing Cache Structures for High Performance and Energy Conservation in Embedded Systems
Transactions on High-Performance Embedded Architectures and Compilers I
Reducing L1 caches power by exploiting software semantics
Proceedings of the 2012 ACM/IEEE international symposium on Low power electronics and design
Reducing memory reference energy with opportunistic virtual caching
Proceedings of the 39th Annual International Symposium on Computer Architecture
Exploiting semantics of virtual memory to improve the efficiency of the on-chip memory system
Euro-Par'12 Proceedings of the 18th international conference on Parallel Processing
Hi-index | 0.00 |
The memory subsystem, including address translations and cache accesses, consumes a major portion of the overall energy on a processor. In this paper, we address the memory energy issues by using a streamlined architectural partitioning technique that effectively reduces energy consumption in the memory subsystem without compromising performance. It is achieved by decoupling the d-TLB lookups and the data cache accesses, based on the semantic regions defined by programming languages and software convention, into discrete reference substreams --- stack, global static, and heap. Their unique access behaviors and locality characteristics are analyzed and exploited for power reduction. Our results show that an average of 35% energy can be reduced in the d-TLB and the data cache. Furthermore, an average of 46% energy can be saved by selectively multi-porting the semantic-aware d-TLBs and data caches against their monolithic counterparts.