IEEE Transactions on Computers
Characterizing the d-TLB behavior of SPEC CPU2000 benchmarks
SIGMETRICS '02 Proceedings of the 2002 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
ACM SIGARCH Computer Architecture News
Practical, transparent operating system support for superpages
ACM SIGOPS Operating Systems Review - OSDI '02: Proceedings of the 5th symposium on Operating systems design and implementation
Practical, transparent operating system support for superpages
OSDI '02 Proceedings of the 5th symposium on Operating systems design and implementationCopyright restrictions prevent ACM from being able to make the PDFs for this conference available for downloading
Advanced non-distributed operating systems course
ACM SIGCSE Bulletin
Efficient address remapping in distributed shared-memory systems
ACM Transactions on Architecture and Code Optimization (TACO)
Proceedings of the 2006 workshop on Memory system performance and correctness
A case for compiler-driven superpage allocation
Proceedings of the 47th Annual Southeast Regional Conference
CoLT: Coalesced Large-Reach TLBs
MICRO-45 Proceedings of the 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture
Hi-index | 0.00 |
Abstract: Typical translation lookaside buffers (TLBs)can map a far smaller region of memory than applicatio footprints demand, and the cost of handling TLB misses therefore limits the performance of a increasing number of applications. This bottleneck can be mitigated by the use of superpages, multiple adjacent virtual memory pages that can be mapped with a single TLB entry, that extend TLB reach without significantly increasing size or cost. We analyze hardware/software tradeoffs for dynamically creating superpages. This study extends previous work by using execution-driven simulation to compare creating superpages via copying with remapping pages within the memory controller, and by examining how the tradeoffs change when moving from a single-issue to a superscalar processor model. We find that remapping-based promotion outperforms copying-based promotion, often significantly. Copying-based promotion is slightly more effective on superscalar processors than on single-issue processors, and the relative performance of remapping-based promotion on the two platforms is application-dependent.