Benchmarking and modeling disk-based storage tiers for practical storage design
Proceedings of the Second International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computing Systems
Also in: ACM SIGMETRICS Performance Evaluation Review
Until recently, archival storage tiers have consisted of tape-based devices offering large storage capacity but limited I/O performance for data retrieval. However, the growing capacity and shrinking cost of disk-based devices mean that disk-based systems are now a realistic option for enterprise archival storage tiers. Given the increasingly diverse options for archival storage, robust benchmarking of candidate technologies is vital for reducing risk before deployment. This paper investigates benchmarks that use archival workloads derived from an analysis of historical file size distributions. These benchmarks not only measure system performance as an archive more appropriately than traditional approaches; they also incorporate the variation observed in the historical data to provide "best"- and "worst"-case workloads for benchmarking. By considering not only the typical workload but also workloads at either end of the archival spectrum, our benchmarking is robust: it provides performance measures across the envelope of archival workloads observed in empirical data.
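The workload-generation idea described above can be sketched in a few lines. This is a minimal illustration only: it assumes a lognormal model for file sizes (a common fit for empirical file-size data, not a detail stated in the abstract), and all function names and parameter values (`mu`, `sigma`, the envelope spreads) are hypothetical, not taken from the paper.

```python
import random

def generate_workload(n_files, mu=12.0, sigma=2.5, seed=None):
    """Draw n_files synthetic file sizes (bytes) from a lognormal
    distribution. mu/sigma are illustrative, not fitted values."""
    rng = random.Random(seed)
    # +1 guards against a zero-byte file after truncation to int
    return [int(rng.lognormvariate(mu, sigma)) + 1 for _ in range(n_files)]

def workload_envelope(n_files, sigma_low=2.0, sigma_high=3.0, seed=0):
    """Bracket the variation seen in historical data by varying the
    spread of the size distribution: a narrower spread gives mostly
    mid-sized files, a wider one a heavier tail of tiny and huge files."""
    best = generate_workload(n_files, sigma=sigma_low, seed=seed)
    typical = generate_workload(n_files, seed=seed)
    worst = generate_workload(n_files, sigma=sigma_high, seed=seed)
    return best, typical, worst
```

A benchmark harness would then replay each of the three workloads against the candidate storage tier, reporting the typical-case result alongside the best/worst envelope rather than a single number.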