Self-similarity in file systems
SIGMETRICS '98/PERFORMANCE '98 Proceedings of the 1998 ACM SIGMETRICS joint international conference on Measurement and modeling of computer systems
A large-scale study of file-system contents
SIGMETRICS '99 Proceedings of the 1999 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
File system usage in Windows NT 4.0
Proceedings of the seventeenth ACM symposium on Operating systems principles
A comparison of file system workloads
ATEC '00 Proceedings of the annual conference on USENIX Annual Technical Conference
A five-year study of file-system metadata
FAST '07 Proceedings of the 5th USENIX conference on File and Storage Technologies
Measurement and analysis of large-scale network file system workloads
ATC'08 USENIX 2008 Annual Technical Conference on Annual Technical Conference
HYDRAstor: a Scalable Secondary Storage
FAST '09 Proccedings of the 7th conference on File and storage technologies
Lithium: virtual machine storage for the cloud
Proceedings of the 1st ACM symposium on Cloud computing
I/O deduplication: utilizing content similarity to improve I/O performance
FAST'10 Proceedings of the 8th USENIX conference on File and storage technologies
Evaluating performance and energy in file system server workloads
FAST'10 Proceedings of the 8th USENIX conference on File and storage technologies
Everest: scaling down peak loads through I/O off-loading
OSDI'08 Proceedings of the 8th USENIX conference on Operating systems design and implementation
Characterizing datasets for data deduplication in backup applications
IISWC '10 Proceedings of the IEEE International Symposium on Workload Characterization (IISWC'10)
Nectar: automatic management of data and computation in datacenters
OSDI'10 Proceedings of the 9th USENIX conference on Operating systems design and implementation
A study of practical deduplication
FAST'11 Proceedings of the 9th USENIX conference on File and stroage technologies
Capo: recapitulating storage for virtual desktops
FAST'11 Proceedings of the 9th USENIX conference on File and stroage technologies
Passive NFS tracing of email and research workloads
FAST'03 Proceedings of the 2nd USENIX conference on File and storage technologies
Pesto: online storage performance management in virtualized datacenters
Proceedings of the 2nd ACM Symposium on Cloud Computing
Live deduplication storage of virtual machine images in an open-source cloud
Middleware'11 Proceedings of the 12th ACM/IFIP/USENIX international conference on Middleware
Characteristics of backup workloads in production systems
FAST'12 Proceedings of the 10th USENIX conference on File and Storage Technologies
Demand based hierarchical QoS using storage resource pools
USENIX ATC'12 Proceedings of the 2012 USENIX conference on Annual Technical Conference
Data Centers in the Cloud: A Large Scale Performance Study
CLOUD '12 Proceedings of the 2012 IEEE Fifth International Conference on Cloud Computing
Cooperative Storage-Level De-duplication for I/O Reduction in Virtualized Data Centers
MASCOTS '12 Proceedings of the 2012 IEEE 20th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems
Romano: autonomous storage management using performance prediction in multi-tenant datacenters
Proceedings of the Third ACM Symposium on Cloud Computing
State-of-the-practice in data center virtualization: Toward a better understanding of VM usage
DSN '13 Proceedings of the 2013 43rd Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN)
Hi-index | 0.00 |
Virtualization is the ubiquitous way to provide computation and storage services to datacenter end-users. Guaranteeing sufficient data storage and efficient data access is central to all datacenter operations, yet little is known of the effects of virtualization on storage workloads. In this study, we collect and analyze field data from production datacenters that operate within the private cloud paradigm, during a period of three years. The datacenters of our study consist of 8,000 physical boxes, hosting over 90,000 VMs, which in turn use over 22 PB of storage. Storage data is analyzed from the perspectives of volume, velocity, and variety of storage demands on virtual machines and of their dependency on other resources. In addition to the growth rate and churn rate of allocated and used storage volume, the trace data illustrates the impact of virtualization and consolidation on the velocity of IO reads and writes, including IO deduplication ratios and peak load analysis of co-located VMs. We focus on a variety of applications which are roughly classified as app, web, database, file, mail, and print, and correlate their storage and IO demands with CPU, memory, and network usage. This study provides critical storage workload characterization by showing usage trends and how application types create storage traffic in large datacenters.