A Self-Organizing Storage Cluster for Parallel Data-Intensive Applications

  • Authors:
  • Hong Tang;Aziz Gulbeden;Jingyu Zhou;William Strathearn;Tao Yang;Lingkun Chu

  • Affiliations:
  • Ask Jeeves;University of California at Santa Barbara;University of California at Santa Barbara;University of California at Santa Barbara;Ask Jeeves and University of California at Santa Barbara;Ask Jeeves

  • Venue:
  • Proceedings of the 2004 ACM/IEEE conference on Supercomputing
  • Year:
  • 2004

Abstract

Cluster-based storage systems are popular for data-intensive applications. It is desirable yet challenging for such systems to provide incremental expansion and high availability while achieving scalability and strong consistency. This paper presents the design and implementation of a self-organizing storage cluster called Sorrento, which targets data-intensive workloads with highly parallel requests and low write-sharing patterns. Sorrento automatically adapts to storage node joins and departures, and the system can be configured and maintained incrementally without interrupting its normal operation. Data location information is distributed across storage nodes using consistent hashing, and the location protocol differentiates small and large data objects for access efficiency. Sorrento adopts versioning to achieve single-file serializability and replication consistency. We present experimental results that demonstrate the features and performance of Sorrento using microbenchmarks, application benchmarks, and application trace replay.
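
The abstract states that data location information is distributed across storage nodes using consistent hashing. The following is a minimal, generic sketch of that technique in Python, not Sorrento's actual location protocol. The class name `HashRing`, the `vnodes` parameter, the choice of SHA-1, and the node and object names are all assumptions made for illustration.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a point on a 2**32 ring (SHA-1 is an arbitrary choice)."""
    return int.from_bytes(hashlib.sha1(key.encode("utf-8")).digest()[:4], "big")


class HashRing:
    """Hypothetical consistent-hash ring mapping object IDs to storage nodes."""

    def __init__(self, vnodes: int = 64):
        self.vnodes = vnodes  # virtual points per node, to smooth the load
        self._points = []     # sorted ring positions
        self._owner = {}      # ring position -> node name

    def add_node(self, node: str) -> None:
        """A node joins: place its virtual points on the ring."""
        for i in range(self.vnodes):
            p = _hash(f"{node}#{i}")
            if p in self._owner:
                continue  # rare position collision: keep the existing owner
            bisect.insort(self._points, p)
            self._owner[p] = node

    def remove_node(self, node: str) -> None:
        """A node departs: drop its points; successors take over its keys."""
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def locate(self, object_id: str) -> str:
        """Return the node owning the first ring point after the object's hash."""
        if not self._points:
            raise LookupError("ring is empty")
        i = bisect.bisect_right(self._points, _hash(object_id)) % len(self._points)
        return self._owner[self._points[i]]


if __name__ == "__main__":
    ring = HashRing()
    for name in ("node-a", "node-b", "node-c"):
        ring.add_node(name)
    print(ring.locate("object-42"))
```

The property that matters for incremental expansion is that when a node joins, the only objects whose location changes are those that move to the new node; when it departs, only its own objects are reassigned. The rest of the cluster's location mapping is untouched.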