As computer system main memories get larger and processor cycles per instruction (CPI) get smaller, the time spent handling translation lookaside buffer (TLB) misses could become a performance bottleneck. We explore relieving this bottleneck by (a) increasing the page size and (b) supporting two page sizes.

We discuss how to build a TLB to support two page sizes and examine both alternatives experimentally with a dozen uniprogrammed, user-mode traces for the SPARC architecture. Our results show that, compared to using 4KB pages, increasing the page size to 32KB causes both a significant increase in average working set size (e.g., 60%) and a significant reduction in the TLB's contribution to CPI, CPI_TLB (namely, a factor of eight). Results for using two page sizes, 4KB and 32KB, on the other hand, show a small increase in working set size (about 10%) and a variable decrease in CPI_TLB (from negligible to as good as that found with the 32KB page size). CPI_TLB when using two page sizes is consistently better for fully associative TLBs than for set-associative ones.

Our results are preliminary, however, since (a) our traces do not include multiprogramming or operating system behavior, and (b) our page-size assignment policy may not reflect a real operating system's policy.
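
To make the quantities in the abstract concrete, the following is a minimal sketch, not the simulator used in the paper. It models a fully associative TLB whose entries can each map either a 4KB or a 32KB page, and it computes the TLB's contribution to CPI under the usual reading: TLB misses per instruction times cycles per miss. The class name `TwoSizeTLB`, the `large_regions` set standing in for a page-size assignment policy, the LRU replacement, and the 25-cycle miss penalty in the usage note are all assumptions made for illustration.

```python
from collections import OrderedDict

SMALL_SHIFT = 12  # 4KB pages
LARGE_SHIFT = 15  # 32KB pages


class TwoSizeTLB:
    """Fully associative LRU TLB; each entry maps a 4KB or a 32KB page.

    Hypothetical model for illustration only.
    """

    def __init__(self, entries, large_regions=()):
        self.entries = entries
        # Assumed policy input: numbers of the 32KB-aligned regions
        # that the operating system has mapped with a large page.
        self.large_regions = set(large_regions)
        self.tags = OrderedDict()  # (page shift, page number), LRU first
        self.misses = 0

    def _tag(self, vaddr):
        region = vaddr >> LARGE_SHIFT
        if region in self.large_regions:
            return (LARGE_SHIFT, region)
        return (SMALL_SHIFT, vaddr >> SMALL_SHIFT)

    def access(self, vaddr):
        """Look up one virtual address; return True on a hit."""
        tag = self._tag(vaddr)
        if tag in self.tags:
            self.tags.move_to_end(tag)  # mark as most recently used
            return True
        self.misses += 1
        if len(self.tags) >= self.entries:
            self.tags.popitem(last=False)  # evict least recently used
        self.tags[tag] = None
        return False


def cpi_tlb(misses, instructions, miss_penalty_cycles):
    """TLB contribution to CPI: misses per instruction times penalty."""
    return misses / instructions * miss_penalty_cycles
```

For a trace that touches eight consecutive 4KB pages starting at address 0, `TwoSizeTLB(4)` records eight misses, while `TwoSizeTLB(4, large_regions={0})` records one, because a single 32KB entry covers the whole region. Passing those miss counts to `cpi_tlb` with an assumed instruction count and an assumed penalty such as 25 cycles shows how the choice of page size feeds into CPI_TLB.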