A Self-Organizing Storage Cluster for Parallel Data-Intensive Applications

Authors:
Hong Tang; Aziz Gulbeden; Jingyu Zhou; William Strathearn; Tao Yang; Lingkun Chu
Affiliations:
Ask Jeeves; University of California at Santa Barbara; University of California at Santa Barbara; University of California at Santa Barbara; Ask Jeeves and University of California at Santa Barbara; Ask Jeeves
Venue:
Proceedings of the 2004 ACM/IEEE conference on Supercomputing
Year:
2004

Abstract

Cluster-based storage systems are popular for data-intensive applications, and it is desirable yet challenging to provide incremental expansion and high availability while achieving scalability and strong consistency. This paper presents the design and implementation of Sorrento, a self-organizing storage cluster that targets data-intensive workloads with highly parallel requests and low write-sharing patterns. Sorrento automatically adapts to storage node joins and departures, and the system can be configured and maintained incrementally without interrupting normal operation. Data location information is distributed across storage nodes using consistent hashing, and the location protocol differentiates between small and large data objects for access efficiency. Sorrento adopts versioning to achieve single-file serializability and replication consistency. We present experimental results that demonstrate the features and performance of Sorrento using microbenchmarks, application benchmarks, and application trace replay.
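The abstract names consistent hashing as the mechanism for distributing data location information. As a rough illustration of that general technique only, the following Python sketch shows a hash ring in which a node join or departure remaps only the keys adjacent to that node. It is not Sorrento's location protocol: the class `HashRing`, its methods, the use of SHA-1, and the `vnodes` parameter are all assumptions made for this example, and the sketch does not model the small/large object distinction, versioning, or replication.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a point on a 2**32 ring (SHA-1 is an arbitrary choice)."""
    return int.from_bytes(hashlib.sha1(key.encode("utf-8")).digest()[:4], "big")


class HashRing:
    """Toy consistent-hash ring; all names here are hypothetical."""

    def __init__(self, vnodes: int = 64):
        self.vnodes = vnodes   # virtual points per node, to smooth the load
        self._points = []      # sorted ring positions
        self._owner = {}       # ring position -> node name

    def add_node(self, node: str) -> None:
        """Place the node's virtual points on the ring."""
        for i in range(self.vnodes):
            p = _hash(f"{node}#{i}")
            if p not in self._owner:   # skip the rare position collision
                bisect.insort(self._points, p)
                self._owner[p] = node

    def remove_node(self, node: str) -> None:
        """Drop every ring position owned by the departing node."""
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def locate(self, object_id: str) -> str:
        """Return the node owning the first ring position after the object's hash."""
        if not self._points:
            raise LookupError("ring is empty")
        i = bisect.bisect_right(self._points, _hash(object_id))
        return self._owner[self._points[i % len(self._points)]]
```

In this sketch, a newly added node takes over only the objects whose hashes fall just before its ring positions, and removing it hands exactly those objects back to their previous owners. That locality is what lets a cluster of this kind grow or shrink without relocating most of its location entries.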