High-performance computer architecture
High-performance computer architecture
801 storage: architecture and programming
ACM Transactions on Computer Systems (TOCS)
Computer
Performance evaluation of a commercial cache-coherent shared memory multiprocessor
SIGMETRICS '90 Proceedings of the 1990 ACM SIGMETRICS conference on Measurement and modeling of computer systems
A high speed superscalar PA-RISC processor
COMPCON '92 Proceedings of the thirty-seventh international conference on COMPCON
Performance of the VAX-11/780 translation buffer: simulation and measurement
ACM Transactions on Computer Systems (TOCS)
The art of computer programming, volume 1 (3rd ed.): fundamental algorithms
The art of computer programming, volume 1 (3rd ed.): fundamental algorithms
The TLB slice—a low-cost high-speed address translation mechanism
ISCA '90 Proceedings of the 17th annual international symposium on Computer Architecture
Design tradeoffs for software-managed TLBs
ACM Transactions on Computer Systems (TOCS)
Optimal allocation of on-chip memory for multiple-API operating systems
ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
Sharing and protection in a single-address-space operating system
ACM Transactions on Computer Systems (TOCS) - Special issue on computer architecture
Guarded page tables on Mips R4600 or an exercise in architecture-dependent micro optimization
ACM SIGOPS Operating Systems Review
Virtual address translation for wide-address architectures
ACM SIGOPS Operating Systems Review
A new page table for 64-bit address spaces
SOSP '95 Proceedings of the fifteenth ACM symposium on Operating systems principles
Exokernel: an operating system architecture for application-level resource management
SOSP '95 Proceedings of the fifteenth ACM symposium on Operating systems principles
Instruction fetching: coping with code bloat
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
Trap-driven memory simulation with Tapeworm II
ACM Transactions on Modeling and Computer Simulation (TOMACS)
Efficient Hardware Hashing Functions for High Performance Computers
IEEE Transactions on Computers
Increasing TLB reach using superpages backed by shadow memory
Proceedings of the 25th annual international symposium on Computer architecture
Options for dynamic address translation in COMAs
Proceedings of the 25th annual international symposium on Computer architecture
Hardware-software trade-offs in a direct Rambus implementation of the RAMpage memory hierarchy
Proceedings of the eighth international conference on Architectural support for programming languages and operating systems
A look at several memory management units, TLB-refill mechanisms, and page table organizations
Proceedings of the eighth international conference on Architectural support for programming languages and operating systems
A fully associative software-managed cache design
Proceedings of the 27th annual international symposium on Computer architecture
Proceedings of the 27th annual international symposium on Computer architecture
Uniprocessor Virtual Memory without TLBs
IEEE Transactions on Computers
Characterizing the d-TLB behavior of SPEC CPU2000 benchmarks
SIGMETRICS '02 Proceedings of the 2002 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
Going the distance for TLB prefetching: an application-driven study
ISCA '02 Proceedings of the 29th annual international symposium on Computer architecture
Access Control Mechanisms in a Distributed, Persistent Memory System
IEEE Transactions on Parallel and Distributed Systems
Improving the Precise Interrupt Mechanism of Software-Managed TLB Miss Handlers
HiPC '01 Proceedings of the 8th International Conference on High Performance Computing
Virtual memory on data diffusion architectures
Parallel Computing
Concurrent Support of Multiple Page Sizes on a Skewed Associative TLB
IEEE Transactions on Computers
Moving Address Translation Closer to Memory in Distributed Shared-Memory Multiprocessors
IEEE Transactions on Parallel and Distributed Systems
In-Line Interrupt Handling and Lock-Up Free Translation Lookaside Buffers (TLBs)
IEEE Transactions on Computers
Deconstructing process isolation
Proceedings of the 2006 workshop on Memory system performance and correctness
Software prefetching and caching for translation lookaside buffers
OSDI '94 Proceedings of the 1st USENIX conference on Operating Systems Design and Implementation
Implementation of multiple pagesize support in HP-UX
ATEC '98 Proceedings of the annual conference on USENIX Annual Technical Conference
Archipelago: trading address space for reliability and security
Proceedings of the 13th international conference on Architectural support for programming languages and operating systems
Inter-core cooperative TLB for chip multiprocessors
Proceedings of the fifteenth edition of ASPLOS on Architectural support for programming languages and operating systems
Revisiting hardware-assisted page walks for virtualized systems
Proceedings of the 39th Annual International Symposium on Computer Architecture
ACM Transactions on Architecture and Code Optimization (TACO) - Special Issue on High-Performance Embedded Architectures and Compilers
Hi-index | 0.01 |
Virtual memory page translation tables provide mappings from virtual to physical addresses. When the hardware controlled Translation Lookaside Buffers (TLBs) do not contain a translation, these tables provide the translation. Approaches to the structure and management of these tables vary from full hardware implementations to complete software based algorithms.The size of the virtual address space used by processes is rapidly growing beyond 32 bits of address. As the utilized address space increases, new problems and issues surface. Traditional methods for managing the page translation tables are inappropriate for large address space architectures.The Hashed Page Table (HPT), described here, provides a very fast and space efficient translation table that reduces overhead by splitting TLB management responsibilities between hardware and software. Measurements demonstrate its applicability to a diverse range of operating systems and workloads and, in particular, to large virtual address space machines. In simulations of over 4 billion instructions, improvement of 5 to 10% were observed.