ACM Transactions on Architecture and Code Optimization (TACO)
Despite the numerous optimization and evaluation studies conducted on TLBs over the years, an in-depth understanding of TLB characteristics from an application perspective is still lacking. This paper presents a detailed characterization study of the TLB behavior of the SPEC CPU2000 benchmark suite. The contributions of this work are in identifying important application characteristics for TLB studies, quantifying the SPEC2000 application behavior for these characteristics, and offering observations and suggestions for future research based on these results.

Around one-fourth of the SPEC2000 applications (ammp, apsi, galgel, lucas, mcf, twolf and vpr) have significant TLB miss rates. Both capacity and associativity influence miss rates, though they do not necessarily go hand-in-hand. Multi-level TLBs are definitely useful for these applications in cutting down access times without significant miss rate degradation. Superpaging to combine TLB entries may not be rewarding for many of these applications. Software management of TLBs, in terms of determining which entries to prefetch, which entries to replace, and which entries to pin, has considerable potential to cut down miss rates. Specifically, the potential benefits of prefetching TLB entries are examined, and Distance Prefetching is shown to give good prediction accuracy for these applications.