Availability in globally distributed storage systems

Authors:
Daniel Ford;François Labelle;Florentina I. Popovici;Murray Stokely;Van-Anh Truong;Luiz Barroso;Carrie Grimes;Sean Quinlan
Affiliations:
Google, Inc.;Google, Inc.;Google, Inc.;Google, Inc.;Dept. of Industrial Engineering and Operations Research, Columbia University and Google, Inc.;Google, Inc.;Google, Inc.;Google, Inc.
Venue:
OSDI'10 Proceedings of the 9th USENIX conference on Operating systems design and implementation
Year:
2010



Abstract

Highly available cloud storage is often implemented with complex, multi-tiered distributed systems built on top of clusters of commodity servers and disk drives. Sophisticated management, load balancing, and recovery techniques are needed to achieve high performance and availability amidst an abundance of failure sources that include software, hardware, network connectivity, and power issues. While there is a relative wealth of failure studies of individual components of storage systems, such as disk drives, relatively little has been reported so far on the overall availability behavior of large cloud-based storage services. We characterize the availability properties of cloud storage systems based on an extensive one-year study of Google's main storage infrastructure and present statistical models that enable further insight into the impact of multiple design choices, such as data placement and replication strategies. With these models we compare data availability under a variety of system parameters given the real patterns of failures observed in our fleet.