An in-cache address translation mechanism
ISCA '86 Proceedings of the 13th annual international symposium on Computer architecture
Coherency for multiprocessor virtual address caches
ASPLOS II Proceedings of the second international conference on Architectual support for programming languages and operating systems
Organization and performance of a two-level virtual-real cache hierarchy
ISCA '89 Proceedings of the 16th annual international symposium on Computer architecture
Tradeoffs in supporting two page sizes
ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
Lightweight shared objects in a 64-bit operating system
OOPSLA '92 conference proceedings on Object-oriented programming systems, languages, and applications
Architecture support for single address space operating systems
ASPLOS V Proceedings of the fifth international conference on Architectural support for programming languages and operating systems
Evolution of the PowerPC Architecture
IEEE Micro
Reducing TLB power requirements
ISLPED '97 Proceedings of the 1997 international symposium on Low power electronics and design
Uniprocessor Virtual Memory without TLBs
IEEE Transactions on Computers
TLB and snoop energy-reduction using virtual caches in low-power chip-multiprocessors
Proceedings of the 2002 international symposium on Low power electronics and design
Proceedings of the 10th international conference on Architectural support for programming languages and operating systems
Generating physical addresses directly for saving instruction TLB energy
Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture
Itanium 2 Processor Microarchitecture
IEEE Micro
Fast Address-Space Switching on the StrongARM SA-1100 Processor
ACAC '00 Proceedings of the 5th Australasian Computer Architecture Conference
U-cache: a cost-effective solution to synonym problem
HPCA '95 Proceedings of the 1st IEEE Symposium on High-Performance Computer Architecture
Energy efficient D-TLB and data cache using semantic-aware multilateral partitioning
Proceedings of the 2003 international symposium on Low power electronics and design
Computer Organization and Design
Computer Organization and Design
Pin: building customized program analysis tools with dynamic instrumentation
Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation
Thermal analysis of a 3D die-stacked high-performance microprocessor
GLSVLSI '06 Proceedings of the 16th ACM Great Lakes symposium on VLSI
Reducing energy of virtual cache synonym lookup using bloom filters
CASES '06 Proceedings of the 2006 international conference on Compilers, architecture and synthesis for embedded systems
Heterogeneously tagged caches for low-power embedded systems with virtual memory support
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Accelerating two-dimensional page walks for virtualized systems
Proceedings of the 13th international conference on Architectural support for programming languages and operating systems
The Synonym Lookaside Buffer: A Solution to the Synonym Problem in Virtual Caches
IEEE Transactions on Computers
Two new techniques integrated for energy-efficient TLB design
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
SpecTLB: a mechanism for speculative address translation
Proceedings of the 38th annual international symposium on Computer architecture
ACM SIGARCH Computer Architecture News
Reducing Data TLB Power via Compiler-Directed Address Generation
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
CoLT: Coalesced Large-Reach TLBs
MICRO-45 Proceedings of the 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture
Efficient virtual memory for big memory servers
Proceedings of the 40th Annual International Symposium on Computer Architecture
A new perspective for efficient virtual-cache coherence
Proceedings of the 40th Annual International Symposium on Computer Architecture
TLC: a tag-less cache for reducing dynamic first level cache energy
Proceedings of the 46th Annual IEEE/ACM International Symposium on Microarchitecture
Proceedings of the 19th international conference on Architectural support for programming languages and operating systems
Designing a practical data filter cache to improve both energy efficiency and performance
ACM Transactions on Architecture and Code Optimization (TACO)
Hi-index | 0.00 |
Most modern cores perform a highly-associative transaction look aside buffer (TLB) lookup on every memory access. These designs often hide the TLB lookup latency by overlapping it with L1 cache access, but this overlap does not hide the power dissipated by TLB lookups. It can even exacerbate the power dissipation by requiring higher associativity L1 cache. With today's concern for power dissipation, designs could instead adopt a virtual L1 cache, wherein TLB access power is dissipated only after L1 cache misses. Unfortunately, virtual caches have compatibility issues, such as supporting writeable synonyms and x86's physical page table walker. This work proposes an Opportunistic Virtual Cache (OVC) that exposes virtual caching as a dynamic optimization by allowing some memory blocks to be cached with virtual addresses and others with physical addresses. OVC relies on small OS changes to signal which pages can use virtual caching (e.g., no writeable synonyms), but defaults to physical caching for compatibility. We show OVC's promise with analysis that finds virtual cache problems exist, but are dynamically rare. We change 240 lines in Linux 2.6.28 to enable OVC. On experiments with Parsec and commercial workloads, the resulting system saves 94-99% of TLB lookup energy and nearly 23% of L1 cache dynamic lookup energy.