Modular data storage with Anvil

Authors:
Mike Mammarella; Shant Hovsepian; Eddie Kohler
Affiliations:
UCLA, Los Angeles, CA, USA; UCLA, Los Angeles, CA, USA; UCLA/Meraki, Los Angeles, CA, USA
Venue:
Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles
Year:
2009


Abstract

Databases have achieved orders-of-magnitude performance improvements by changing the layout of stored data -- for instance, by arranging data in columns or compressing it before storage. These improvements have been implemented in monolithic new engines, however, making it difficult to experiment with feature combinations or extensions. We present Anvil, a modular and extensible toolkit for building database back ends. Anvil's storage modules, called dTables, have much finer granularity than prior work. For example, some dTables specialize in writing data, while others provide optimized read-only formats. This specialization makes both kinds of dTable simple to write and understand. Unifying dTables implement more comprehensive functionality by layering over other dTables -- for instance, building a read/write store from read-only tables and a writable journal, or building a general-purpose store from optimized special-purpose stores. The dTable design leads to a flexible system powerful enough to implement many database storage layouts. Our prototype implementation of Anvil performs up to 5.5 times faster than an existing B-tree-based database back end on conventional workloads, and can easily be customized for further gains on specific data and workloads.
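The layering idea in the abstract, a read/write store built from a read-only table and a writable journal, can be illustrated with a minimal Python sketch. This is not Anvil's API: the class and method names (`ReadOnlyTable`, `JournalTable`, `LayeredTable`, `digest`) are hypothetical, everything is held in memory, and the merge step is a deliberately naive assumption about how journal contents might be folded into a new read-only table.

```python
import bisect
from typing import Dict, List, Optional, Tuple

# Sentinel recorded in the journal to mark a deleted key (hypothetical design).
_TOMBSTONE = object()


class ReadOnlyTable:
    """Immutable store: sorted keys, lookups by binary search."""

    def __init__(self, items: Dict[str, bytes]):
        self._keys: List[str] = sorted(items)
        self._values: List[bytes] = [items[k] for k in self._keys]

    def get(self, key: str) -> Optional[bytes]:
        i = bisect.bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            return self._values[i]
        return None

    def items(self) -> List[Tuple[str, bytes]]:
        return list(zip(self._keys, self._values))


class JournalTable:
    """Write-optimized store: an append-only log plus an in-memory index."""

    def __init__(self) -> None:
        self.log: List[Tuple[str, object]] = []
        self._index: Dict[str, object] = {}

    def put(self, key: str, value: object) -> None:
        self.log.append((key, value))  # every change is appended to the log
        self._index[key] = value       # the index keeps only the latest value

    def lookup(self, key: str) -> object:
        # Returns a value, _TOMBSTONE, or None if this layer has no entry.
        return self._index.get(key)

    def items(self) -> List[Tuple[str, object]]:
        return list(self._index.items())


class LayeredTable:
    """Unifying store: reads check the journal first, then the read-only base."""

    def __init__(self, base: ReadOnlyTable) -> None:
        self._base = base
        self._journal = JournalTable()

    def put(self, key: str, value: bytes) -> None:
        self._journal.put(key, value)

    def remove(self, key: str) -> None:
        self._journal.put(key, _TOMBSTONE)

    def get(self, key: str) -> Optional[bytes]:
        hit = self._journal.lookup(key)
        if hit is _TOMBSTONE:
            return None                # deleted in the journal; hide the base
        if hit is not None:
            return hit                 # newest value wins
        return self._base.get(key)

    def digest(self) -> None:
        """Fold the journal into a fresh read-only table and reset the journal."""
        merged: Dict[str, bytes] = dict(self._base.items())
        for key, value in self._journal.items():
            if value is _TOMBSTONE:
                merged.pop(key, None)
            else:
                merged[key] = value
        self._base = ReadOnlyTable(merged)
        self._journal = JournalTable()
```

In this sketch neither component needs to know about the other: the read-only table never handles updates, the journal never handles efficient long-term layout, and only the unifying layer decides how the two combine. Calling `digest()` on a `LayeredTable` leaves the original `ReadOnlyTable` object untouched and swaps in a new one, which matches the read-only spirit of the base layer.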