Design tradeoffs for software-managed TLBs

Authors:
David Nagle;Richard Uhlig;Tim Stanley;Stuart Sechrest;Trevor Mudge;Richard Brown
Venue:
ISCA '93 Proceedings of the 20th annual international symposium on computer architecture
Year:
1993

Abstract

An increasing number of architectures provide virtual memory support through software-managed TLBs. However, software management can impose considerable penalties, which are highly dependent on the operating system's structure and its use of virtual memory. This work explores software-managed TLB design tradeoffs and their interaction with a range of operating systems, including monolithic and microkernel designs. Through hardware monitoring and simulations, we explore TLB performance for benchmarks on a MIPS R2000-based workstation running Ultrix, OSF/1, and three versions of Mach 3.0.

Results: New operating systems are changing the relative frequency of different types of TLB misses, some of which may not be efficiently handled by current architectures. For the same application binaries, total TLB service time varies by as much as an order of magnitude under different operating systems. Reducing the handling cost for kernel TLB misses reduces total TLB service time by up to 40%. For TLBs between 32 and 128 slots, each doubling of the TLB size reduces total TLB service time by up to 50%.
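
As a rough illustration of the quantities in the abstract, the sketch below treats total TLB service time as the sum, over miss types, of miss count times per-miss handling cost, and works out the best case implied by "up to 50% per doubling" across the 32 to 128 slot range. The function names, the two miss categories, and every number are hypothetical and chosen only to make the arithmetic visible; none of them are measurements from the paper.

```python
def total_service_time(miss_counts, miss_costs):
    """Sum of (misses x cycles per miss) over all miss types (assumed model)."""
    return sum(miss_counts[kind] * miss_costs[kind] for kind in miss_counts)


def best_case_fraction(start_slots, end_slots, per_doubling_reduction=0.5):
    """Fraction of service time remaining if every doubling of the TLB
    achieves the full stated reduction (an upper bound on the benefit)."""
    doublings = 0
    slots = start_slots
    while slots < end_slots:
        slots *= 2
        doublings += 1
    return (1.0 - per_doubling_reduction) ** doublings


# Hypothetical miss counts and per-miss costs in cycles (not from the paper).
counts = {"user": 1000, "kernel": 100}
costs = {"user": 20, "kernel": 300}

baseline = total_service_time(counts, costs)
# Same miss counts, but with a cheaper (hypothetical) kernel miss handler.
cheaper_kernel = total_service_time(counts, {"user": 20, "kernel": 40})

print(baseline, cheaper_kernel)
print(best_case_fraction(32, 128))
```

Under these made-up inputs, a small number of expensive kernel misses dominates the total, which is why lowering their handling cost can matter more than their count suggests. Two doublings (32 to 64 to 128 slots) at the full 50% each would leave at best a quarter of the original service time.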