An operating system's readahead and buffer-cache behaviors can significantly impact application performance; most often they improve it, but occasionally they worsen it. To avoid unintended I/O latencies, many database systems sidestep these OS features by minimizing or eliminating application file I/O. Network traffic measurement applications, however, are commonly built atop a high-performance file-based database: the Round Robin Database (RRD) Tool. While RRD is successful, experience has led the network operations community to believe that it scales only to tens of thousands, or perhaps one hundred thousand, RRD files on a single system, which keeps it from being used to measure the largest managed networks today. We identify the bottleneck responsible for that experience and present two approaches to overcome it. In this paper, we provide a method and tools to expose readahead and buffer-cache behaviors that are otherwise hidden from the user. We apply our method to a very large network traffic measurement system that experiences scalability problems and determine that the performance bottleneck is unnecessary disk reads and page faults caused by the default readahead behavior. We develop both a simulation and an analytical model of the performance-limiting page fault rate for RRD file updates. We develop and evaluate two approaches that alleviate the problem: application advice to disable readahead, and application-level caching. We demonstrate their effectiveness by configuring and operating the world's largest Multi-Router Traffic Grapher (MRTG), with approximately 320,000 RRD files and over half a million data points measured every five minutes. Conservatively, our techniques approximately triple the capacity of very large MRTG and other RRD-based measurement systems.
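The abstract names "application advice to disable readahead" without showing how an application would give that advice. On POSIX systems one common way is `posix_fadvise` with `POSIX_FADV_RANDOM`, which on Linux turns off file readahead for the descriptor. The sketch below illustrates that call from Python. Whether it matches the mechanism the authors used is an assumption, and the helper names `open_without_readahead` and `update_slot` are hypothetical, not part of RRDtool or MRTG.

```python
import os


def open_without_readahead(path):
    """Open a file read/write and advise the kernel that access will be random.

    Hypothetical helper for illustration; not an RRDtool API.
    """
    fd = os.open(path, os.O_RDWR)
    # os.posix_fadvise exists only on some Unix platforms, so guard the call.
    if hasattr(os, "posix_fadvise"):
        try:
            # offset=0 and length=0 apply the advice to the whole file.
            os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_RANDOM)
        except OSError:
            # The advice is only a hint; carry on if the filesystem rejects it.
            pass
    return fd


def update_slot(fd, offset, payload):
    """Overwrite a small region in place, as a periodic file update might."""
    return os.pwrite(fd, payload, offset)
```

A caller would open each data file once with `open_without_readahead` and then issue small in-place writes through `update_slot`. With readahead disabled, a small update should pull in only the pages it touches rather than a larger readahead window.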