Traversing the page table during virtual-to-physical address translation stalls the pipeline whenever a lookup misses in the translation-lookaside buffer (TLB). State-of-the-art translation proposals typically optimize a single aspect of translation performance (e.g., translation sharing or context-switch performance), with potential trade-offs of additional hardware complexity, increased translation latency, or reduced scalability. In this article, we propose the partial sharing TLB (PS-TLB), a fast and scalable solution that reduces off-chip translation misses without sacrificing the timing-critical requirement of on-chip translation. We introduce the partial sharing buffer (PSB), which leverages application page-sharing characteristics using minimal additional hardware resources. Compared to the leading TLB proposal that leverages sharing, PS-TLB improves translation latency by more than 45% and yields a 9% application speedup while using fewer storage resources. In addition, the page classification and PS-TLB architecture enable further optimizations, including a reduction of over 30% in interprocessor interrupts for coherence and fewer context-switch misses with fewer resources than existing methods.
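To make the general idea concrete, the following is a toy software model of a translation path in which a per-core TLB is backed by a small buffer that holds only pages classified as shared. It is not the PS-TLB or PSB design from the article, whose organization is not described here. The names `LRUBuffer`, `ToyTranslator`, and `shared_pages`, the 4 KiB page size, the LRU replacement policy, and the buffer sizes are all assumptions chosen for illustration.

```python
from collections import OrderedDict

PAGE_SIZE = 4096  # assumption: 4 KiB pages


class LRUBuffer:
    """Hypothetical fully associative translation buffer with LRU replacement."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()  # virtual page number -> physical frame number

    def lookup(self, vpn):
        if vpn in self.entries:
            self.entries.move_to_end(vpn)  # mark as most recently used
            return self.entries[vpn]
        return None

    def insert(self, vpn, pfn):
        self.entries[vpn] = pfn
        self.entries.move_to_end(vpn)
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used entry


class ToyTranslator:
    """Illustrative model only: private TLBs plus one buffer for shared pages."""

    def __init__(self, page_table, shared_pages, num_cores, tlb_size=4, shared_size=4):
        self.page_table = page_table      # dict: vpn -> pfn (stands in for the page table)
        self.shared_pages = shared_pages  # assumption: pages already classified as shared
        self.tlbs = [LRUBuffer(tlb_size) for _ in range(num_cores)]
        self.shared = LRUBuffer(shared_size)
        self.page_walks = 0               # counts the expensive fallbacks

    def translate(self, core, vaddr):
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        pfn = self.tlbs[core].lookup(vpn)
        if pfn is None:
            # Private miss: try the shared buffer before walking the page table.
            pfn = self.shared.lookup(vpn)
            if pfn is None:
                pfn = self.page_table[vpn]  # models the page-table walk
                self.page_walks += 1
                if vpn in self.shared_pages:
                    self.shared.insert(vpn, pfn)
            self.tlbs[core].insert(vpn, pfn)
        return pfn * PAGE_SIZE + offset
```

In this model, a page marked as shared costs one walk no matter how many cores touch it, provided it stays resident in the shared buffer, while a private page costs one walk per core that misses on it. Real hardware would also have to handle invalidation and coherence, which the sketch omits.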