Rose: compressed, log-structured replication

Authors:
Russell Sears; Mark Callaghan; Eric Brewer
Affiliations:
UC Berkeley; Google; UC Berkeley
Venue:
Proceedings of the VLDB Endowment
Year:
2008

Abstract

Rose is a database storage engine for high-throughput replication. It targets seek-limited, write-intensive transaction processing workloads that perform near real-time decision support and analytical processing queries. Rose uses log-structured merge (LSM) trees to create full database replicas using purely sequential I/O, allowing it to provide orders of magnitude more write throughput than B-tree-based replicas. In addition, LSM-trees cannot become fragmented, and they provide fast, predictable index scans.

Rose's write performance relies on replicas' ability to perform writes without looking up old values. LSM-tree lookups have performance comparable to B-tree lookups, so if Rose read each value that it updated, its write throughput would also be comparable to that of a B-tree. Although we target replication, Rose provides high write throughput to any application that updates tuples without reading existing data, such as append-only, streaming and versioning databases.

We introduce a page compression format that takes advantage of the LSM-tree's sequential, sorted data layout. It increases replication throughput by reducing sequential I/O, and it enables efficient tree lookups by supporting small page sizes and doubling as an index of the values it stores. Any scheme that can compress data in a single pass and provide random access to compressed values could be used by Rose.

Replication environments have multiple readers but only one writer. This allows Rose to provide atomicity, consistency and isolation to concurrent transactions without resorting to rollback, blocking index requests or interfering with maintenance tasks. Rose avoids random I/O during replication and scans, leaving more I/O capacity for queries than existing systems, and providing scalable, real-time replication of seek-bound workloads. Analytical models and experiments show that Rose provides orders of magnitude greater replication bandwidth over larger databases than conventional techniques.
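The following is a minimal Python sketch of the idea the abstract describes: updates applied without reading old values, then folded into sorted data by a sequential merge. It is not Rose's implementation. The class name `ToyLSMTree`, the `c0_limit` parameter, and the two-level layout (an in-memory buffer plus one sorted run held in Python lists) are assumptions made for illustration. Compression, durability and concurrency are omitted.

```python
import bisect


class ToyLSMTree:
    """Toy two-level LSM-tree: an in-memory buffer (C0) plus one sorted run (C1).

    Hypothetical illustration only; the sorted run stands in for on-disk data.
    """

    def __init__(self, c0_limit=4):
        self.c0 = {}          # newest writes, held in memory
        self.c1_keys = []     # sorted keys of the larger run
        self.c1_vals = []     # values aligned with c1_keys
        self.c0_limit = c0_limit

    def put(self, key, value):
        # Blind write: the new value replaces any older one without a lookup in C1.
        self.c0[key] = value
        if len(self.c0) >= self.c0_limit:
            self.merge()

    def merge(self):
        # Single sequential pass over two sorted inputs; C0 wins when keys collide.
        new = sorted(self.c0.items())
        keys, vals = [], []
        i = j = 0
        while i < len(new) or j < len(self.c1_keys):
            take_new = j == len(self.c1_keys) or (
                i < len(new) and new[i][0] <= self.c1_keys[j]
            )
            if take_new:
                if j < len(self.c1_keys) and new[i][0] == self.c1_keys[j]:
                    j += 1  # skip the stale version stored in C1
                keys.append(new[i][0])
                vals.append(new[i][1])
                i += 1
            else:
                keys.append(self.c1_keys[j])
                vals.append(self.c1_vals[j])
                j += 1
        self.c1_keys, self.c1_vals = keys, vals
        self.c0 = {}

    def get(self, key):
        # Reads check the newest level first, then binary-search the sorted run.
        if key in self.c0:
            return self.c0[key]
        pos = bisect.bisect_left(self.c1_keys, key)
        if pos < len(self.c1_keys) and self.c1_keys[pos] == key:
            return self.c1_vals[pos]
        return None
```

In this sketch, `put` never touches the sorted run, so the cost of a write does not depend on where the old value lives. Stale versions are discarded only when `merge` streams both levels in key order.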