A transparent Linux super page kernel for Alpha, Sparc64 and IA32: reducing TLB misses of applications

Authors:
Naohiko Shimizu; Ken Takatori
Affiliations:
Tokai University, Kanagawa Japan;Tokai University, Kanagawa Japan
Venue:
ACM SIGARCH Computer Architecture News
Year:
2003

Abstract

Modern processors offer several latency-tolerance features, such as hit-under-miss, out-of-order execution, and multithreading. However, many processors must take a precise trap on a TLB miss, because they maintain the TLB in software and cannot distinguish a miss caused by limited TLB capacity from a page fault. It is therefore important for applications and the operating system to avoid TLB misses as much as possible. Many processors provide super page features that significantly extend TLB coverage, but few operating systems support them. In this paper, we present our implementation of a super page kernel for the Linux operating system, along with its performance. We implemented super pages for Alpha, Sparc64, and IA32. Compared with the normal Linux kernel, a matrix transpose program under the super page kernel runs at least 4 times faster on Alpha and at least 2 times faster on Sparc64, and produces interesting results on IA32. In addition, SPEC CPU scores on Alpha were about 10% higher than with the normal kernel.
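To illustrate why super pages extend TLB coverage and why a matrix transpose is sensitive to it, the following is a minimal sketch rather than the authors' benchmark. The 128-entry TLB, the 1024 x 1024 matrix of 8-byte elements, and the 8 KiB and 4 MiB page sizes are assumptions chosen for illustration, and both function names are hypothetical.

```python
def tlb_coverage_bytes(entries: int, page_size: int) -> int:
    """Memory reachable without a TLB miss: one page per TLB entry."""
    return entries * page_size


def pages_touched_by_column_walk(rows: int, row_bytes: int, page_size: int) -> int:
    """Distinct pages touched when reading column 0 of a row-major matrix.

    A transpose reads (or writes) one of its matrices column by column,
    so consecutive accesses are row_bytes apart in memory.
    """
    return len({(r * row_bytes) // page_size for r in range(rows)})


if __name__ == "__main__":
    KIB, MIB = 1024, 1024 * 1024
    entries = 128                 # assumed TLB size, for illustration only
    rows, row_bytes = 1024, 1024 * 8  # assumed 1024 x 1024 matrix of 8-byte elements

    for page_size in (8 * KIB, 4 * MIB):
        coverage = tlb_coverage_bytes(entries, page_size)
        touched = pages_touched_by_column_walk(rows, row_bytes, page_size)
        fits = "fits in" if touched <= entries else "exceeds"
        print(f"page={page_size} B coverage={coverage} B "
              f"pages per column={touched} ({fits} the TLB)")
```

Under these assumptions, a single column walk with 8 KiB pages touches 1024 distinct pages, far more than the assumed 128 entries, while the same walk with 4 MiB pages touches only 2. This models only the address arithmetic and says nothing about the measured speedups reported in the abstract.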