An efficient multi-tier tablet server storage architecture

  • Authors:
  • Richard P. Spillane;Pradeep J. Shetty;Erez Zadok;Sagar Dixit;Shrikar Archak

  • Affiliations:
  • Stony Brook University;Stony Brook University;Stony Brook University;Stony Brook University;Stony Brook University

  • Venue:
  • Proceedings of the 2nd ACM Symposium on Cloud Computing
  • Year:
  • 2011

Quantified Score

Hi-index 0.00

Visualization

Abstract

Distributed, structured data stores such as Big Table, HBase, and Cassandra use a cluster of machines, each running a database-like software system called the Tablet Server Storage Layer or TSSL. A TSSL's performance on each node directly impacts the performance of the entire cluster. In this paper we introduce an efficient, scalable, multi-tier storage architecture for tablet servers. Our system can use any layered mix of storage devices such as Flash SSDs and magnetic disks. Our experiments show that by using a mix of technologies, performance for certain workloads can be improved beyond configurations using strictly two-tier approaches with one type of storage technology. We utilized, adapted, and integrated cache-oblivious algorithms and data structures, as well as Bloom filters, to improve scalability significantly. We also support versatile, efficient transactional semantics. We analyzed and evaluated our system against the storage layers of Cassandra and Hadoop HBase. We used wide range of workloads and configurations from read- to write-optimized, as well as different input sizes. We found that our system is 3--10× faster than existing systems; that using proper data structures, algorithms, and techniques is critical for scalability, especially on modern Flash SSDs; and that one can fully support versatile transactions without sacrificing performance.