Flat datacenter storage

Authors:
Edmund B. Nightingale;Jeremy Elson;Jinliang Fan;Owen Hofmann;Jon Howell;Yutaka Suzue
Affiliations:
Microsoft Research;Microsoft Research;Microsoft Research;Microsoft Research and University of Texas at Austin;Microsoft Research;Microsoft Research
Venue:
OSDI'12 Proceedings of the 10th USENIX conference on Operating Systems Design and Implementation
Year:
2012

Abstract

Flat Datacenter Storage (FDS) is a high-performance, fault-tolerant, large-scale, locality-oblivious blob store. Using a novel combination of full-bisection-bandwidth networks, data and metadata striping, and flow control, FDS multiplexes an application's large-scale I/O across the available throughput and latency budget of every disk in a cluster. FDS therefore makes many optimizations around data locality unnecessary. Disks also communicate with each other at their full bandwidth, making recovery from disk failures extremely fast. FDS is designed for datacenter scale, fully distributing metadata operations that might otherwise become a bottleneck. FDS applications achieve single-process read and write performance of more than 2 GB/s. We measure recovery of 92 GB of data lost to disk failure in 6.2 s and recovery from a total machine failure with 655 GB of data in 33.7 s. Application performance is also high: we describe our FDS-based sort application, which set the 2012 world record for disk-to-disk sorting.
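
To put the recovery figures in the abstract on a common scale, the short sketch below converts each one to an average aggregate recovery rate. It uses only the numbers quoted above; the function name is hypothetical, and the result is a simple average over the whole recovery interval (assuming decimal gigabytes), not a rate reported by the authors.

```python
def recovery_throughput_gb_per_s(data_gb: float, seconds: float) -> float:
    """Average aggregate recovery rate in GB/s (hypothetical helper)."""
    return data_gb / seconds


# Figures taken from the abstract: (data recovered in GB, elapsed seconds).
disk_failure = recovery_throughput_gb_per_s(92, 6.2)
machine_failure = recovery_throughput_gb_per_s(655, 33.7)

print(f"single disk failure:   {disk_failure:.1f} GB/s")     # about 14.8 GB/s
print(f"whole machine failure: {machine_failure:.1f} GB/s")  # about 19.4 GB/s
```

Both averages are several times the 2 GB/s single-process figure, which is consistent with the abstract's point that recovery work is spread across many disks communicating at full bandwidth rather than funnelled through one replacement disk.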