The performance impact of I/O optimizations and disk improvements

Authors:
W. Hsu;A. J. Smith
Affiliations:
IBM Research Division, Almaden Research Center, 650 Harry Road, San Jose, California 95120;IBM Research Division, Almaden Research Center, 650 Harry Road, San Jose, California 95120
Venue:
IBM Journal of Research and Development
Year:
2004

Abstract

In this paper, we use real server and personal computer workloads to systematically analyze the true performance impact of various I/O optimization techniques, including read caching, sequential prefetching, opportunistic prefetching, write buffering, request scheduling, striping, and short-stroking. We also break down disk technology improvement into four basic effects (faster seeks, higher RPM, linear density improvement, and increase in track density) and analyze each separately to determine its actual benefit. In addition, we examine the historical rates of improvement and use the trends to project the effect of disk technology scaling. As part of this study, we develop a methodology for replaying real workloads that more accurately models I/O arrivals and that allows the I/O rate to be scaled more realistically than was previously possible. We find that optimization techniques that reduce the number of physical I/Os are generally more effective than those that improve the efficiency of performing the I/Os. Sequential prefetching and write buffering are particularly effective, reducing the average read and write response time by about 50% and 90%, respectively. Our results suggest that a reliable method for improving performance is to use larger caches up to and even beyond 1% of the storage used. For a given workload, our analysis shows that disk technology improvement at the historical rate increases performance by about 8% per year if the disk occupancy rate is kept constant, and by about 15% per year if the same number of disks is used. We discover that the actual average seek time and rotational latency are, respectively, only about 35% and 60% of the specified values. We also observe that the disk head positioning time far dominates the data transfer time, suggesting that to effectively utilize the available disk bandwidth, data should be reorganized such that accesses become more sequential.
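
To make the headline figures in the abstract concrete, the following Python sketch works through two pieces of arithmetic. It is an illustration only, not the authors' methodology: it assumes the quoted annual gains (about 8% and 15%) compound from year to year, and it uses a hypothetical drive (8 ms specified average seek, 10,000 RPM) that does not come from the paper. The function names are likewise invented for this example.

```python
def compounded_gain(annual_rate: float, years: int) -> float:
    """Cumulative performance multiplier after `years`.

    Assumption (not stated in the abstract): the annual gain compounds.
    """
    return (1.0 + annual_rate) ** years


def effective_positioning_ms(
    spec_seek_ms: float,
    rpm: float,
    seek_fraction: float = 0.35,     # abstract: actual seek is ~35% of spec
    latency_fraction: float = 0.60,  # abstract: actual latency is ~60% of spec
) -> float:
    """Estimated average head positioning time in milliseconds.

    The specified average rotational latency is taken to be half a
    revolution, which is the usual data-sheet convention.
    """
    spec_latency_ms = 0.5 * 60_000.0 / rpm
    return seek_fraction * spec_seek_ms + latency_fraction * spec_latency_ms


if __name__ == "__main__":
    # Five years of improvement at the two rates quoted in the abstract.
    print(f"{compounded_gain(0.08, 5):.2f}x")  # constant occupancy, ~1.47x
    print(f"{compounded_gain(0.15, 5):.2f}x")  # same number of disks, ~2.01x

    # Hypothetical drive: 8 ms specified seek, 10,000 RPM (3 ms latency).
    print(f"{effective_positioning_ms(8.0, 10_000):.2f} ms")  # ~4.60 ms
```

Under these assumptions, the hypothetical drive's specified positioning time of 11 ms (8 ms seek plus 3 ms latency) falls to roughly 4.6 ms in practice, which is still far larger than the time needed to transfer a small request and is consistent with the abstract's point that positioning dominates.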