The performance impact of I/O optimizations and disk improvements

  • Authors:
  • W. Hsu; A. J. Smith

  • Affiliations:
  • IBM Research Division, Almaden Research Center, 650 Harry Road, San Jose, California 95120; IBM Research Division, Almaden Research Center, 650 Harry Road, San Jose, California 95120

  • Venue:
  • IBM Journal of Research and Development
  • Year:
  • 2004

Abstract

In this paper, we use real server and personal computer workloads to systematically analyze the true performance impact of various I/O optimization techniques, including read caching, sequential prefetching, opportunistic prefetching, write buffering, request scheduling, striping, and short-stroking. We also break down disk technology improvement into four basic effects (faster seeks, higher RPM, higher linear density, and higher track density) and analyze each separately to determine its actual benefit. In addition, we examine the historical rates of improvement and use the trends to project the effect of disk technology scaling. As part of this study, we develop a methodology for replaying real workloads that models I/O arrivals more accurately and allows the I/O rate to be scaled more realistically than was previously possible. We find that optimization techniques that reduce the number of physical I/Os are generally more effective than those that improve the efficiency of performing the I/Os. Sequential prefetching and write buffering are particularly effective, reducing the average read and write response times by about 50% and 90%, respectively. Our results suggest that a reliable method for improving performance is to use larger caches, up to and even beyond 1% of the storage used. For a given workload, our analysis shows that disk technology improvement at the historical rate increases performance by about 8% per year if the disk occupancy rate is kept constant, and by about 15% per year if the same number of disks is used. We discover that the actual average seek time and rotational latency are, respectively, only about 35% and 60% of the specified values. We also observe that the disk head positioning time far dominates the data transfer time, suggesting that, to utilize the available disk bandwidth effectively, data should be reorganized so that accesses become more sequential.