LazyBase: trading freshness for performance in a scalable database

  • Authors:
  • James Cipar; Greg Ganger; Kimberly Keeton; Charles B. Morrey, III; Craig A.N. Soules; Alistair Veitch

  • Affiliations:
  • Carnegie Mellon University, Pittsburgh, PA, USA; Carnegie Mellon University, Pittsburgh, PA, USA; HP Labs, Palo Alto, CA, USA; HP Labs, Palo Alto, CA, USA; HP Labs, Palo Alto, CA, USA; HP Labs, Palo Alto, CA, USA

  • Venue:
  • Proceedings of the 7th ACM European Conference on Computer Systems
  • Year:
  • 2012

Abstract

The LazyBase scalable database system is specialized for the growing class of data analysis applications that extract knowledge from large, rapidly changing data sets. It provides the scalability of popular NoSQL systems without the query-time complexity associated with their eventual consistency models, offering a clear consistency model and explicit per-query control over the trade-off between latency and result freshness. With an architecture designed around batching and pipelining of updates, LazyBase simultaneously ingests atomic batches of updates at very high throughput and offers quick read queries to a stale-but-consistent version of the data. Although slightly stale results are sufficient for many analysis queries, fully up-to-date results can be obtained when necessary by also scanning updates still in the pipeline. Compared to the Cassandra NoSQL system, LazyBase provides 4X to 5X faster update throughput and 4X faster read query throughput for range queries while remaining competitive for point queries. We demonstrate LazyBase's trade-off between query latency and result freshness as well as the benefits of its consistency model. We also demonstrate specific cases where Cassandra's consistency model is weaker than LazyBase's.
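
The freshness trade-off described in the abstract can be illustrated with a toy in-memory model. This is only a sketch under stated assumptions, not LazyBase's actual design or API: the class `ToyLazyStore`, its methods, and the boolean `fresh` flag are all hypothetical names chosen for illustration. It shows atomic batches waiting in a queue (standing in for the pipeline), a cheap read against the stale-but-consistent snapshot, and a more expensive read that also scans the pending batches.

```python
from dataclasses import dataclass, field


@dataclass
class ToyLazyStore:
    """Hypothetical toy model: a consistent snapshot plus queued atomic batches."""

    snapshot: dict = field(default_factory=dict)
    pending: list = field(default_factory=list)  # batches still "in the pipeline"

    def ingest(self, batch: dict) -> None:
        # Queue the batch as a unit; snapshot readers never see part of it.
        self.pending.append(dict(batch))

    def apply_next_batch(self) -> bool:
        # Fold the oldest pending batch into the snapshot in one step.
        if not self.pending:
            return False
        self.snapshot = {**self.snapshot, **self.pending.pop(0)}
        return True

    def read(self, key, fresh: bool = False):
        # fresh=False: look only at the stale-but-consistent snapshot.
        # fresh=True: also scan pending batches, newest first (extra work).
        if fresh:
            for batch in reversed(self.pending):
                if key in batch:
                    return batch[key]
        return self.snapshot.get(key)


store = ToyLazyStore()
store.ingest({"a": 1, "b": 2})
store.apply_next_batch()           # first batch becomes visible atomically
store.ingest({"a": 10})            # second batch is still pending

print(store.read("a"))             # 1  (stale but consistent)
print(store.read("a", fresh=True)) # 10 (scans the pending batch)
```

In this sketch the caller chooses per query whether to pay for scanning pending updates, which mirrors the latency-versus-freshness choice the abstract describes; a real system would bound staleness and distribute the pipeline rather than use a single in-process list.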