Optimizing write performance for read optimized databases

Authors:
Jens Krueger;Martin Grund;Christian Tinnefeld;Hasso Plattner;Alexander Zeier;Franz Faerber
Affiliations:
Hasso-Plattner-Institut, Potsdam, Germany;Hasso-Plattner-Institut, Potsdam, Germany;Hasso-Plattner-Institut, Potsdam, Germany;Hasso-Plattner-Institut, Potsdam, Germany;Hasso-Plattner-Institut, Potsdam, Germany;SAP AG, Walldorf
Venue:
DASFAA'10 Proceedings of the 15th international conference on Database Systems for Advanced Applications - Volume Part II
Year:
2010

Abstract

Compression in column-oriented databases has been shown to improve performance and reduce storage consumption. This is especially true for read access, as compressed data can be processed directly during query execution. Compression is a disadvantage for write access, however, because re-compression is unavoidable: a write has to read significantly more data than the operation itself involves, more tuples may have to be modified depending on the compression algorithm, and table-level locks have to be acquired instead of row-level locks as long as no second version of the data is stored. As a result, the duration of a single modification (insert or update) significantly limits both throughput and response time. In this paper, we propose an additional write-optimized buffer that maintains the delta which, in conjunction with the compressed main store, represents the current state of the data. This buffer uses an uncompressed, column-oriented data structure. To address the disadvantages of data compression mentioned above, we improve write performance at the expense of query performance and memory consumption: the buffer serves as intermediate storage for several modifications, which are then applied in bulk by a merge operation. The overhead of a single re-compression is thereby shared among all recent modifications. We evaluate our implementation inside SAP’s in-memory column store, analyze the parameters that influence the merge process, and provide a complexity analysis. Finally, we present optimizations regarding resource consumption and merge duration.
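
The delta-buffer idea described in the abstract can be illustrated with a small sketch. This is not the SAP implementation. It assumes a single column, dictionary encoding as the compression scheme (the abstract does not name one), and inserts only; the class `Column` and its methods `insert`, `scan_equal`, and `merge` are hypothetical names chosen for illustration.

```python
from bisect import bisect_left


class Column:
    """Toy column: dictionary-compressed main store plus uncompressed delta.

    Assumption: dictionary encoding stands in for whatever compression
    the real system uses. Only inserts are modeled, not updates.
    """

    def __init__(self, values=()):
        self.dictionary = []  # sorted distinct values of the main store
        self.main = []        # one dictionary index per row (compressed part)
        self.delta = list(values)  # uncompressed values since the last merge
        self.merge()

    def insert(self, value):
        # Cheap write: append to the delta, no re-compression of the main store.
        self.delta.append(value)

    def scan_equal(self, value):
        """Return positions of rows equal to `value`, main store first, then delta."""
        hits = []
        # Main store: look up the code once, then compare integer codes only.
        pos = bisect_left(self.dictionary, value)
        if pos < len(self.dictionary) and self.dictionary[pos] == value:
            hits = [row for row, code in enumerate(self.main) if code == pos]
        # Delta: compare uncompressed values; this is the read-side cost of the buffer.
        offset = len(self.main)
        hits += [offset + i for i, v in enumerate(self.delta) if v == value]
        return hits

    def merge(self):
        """Fold the whole delta into the main store with one re-compression."""
        rows = [self.dictionary[code] for code in self.main] + self.delta
        self.dictionary = sorted(set(rows))
        code_of = {v: i for i, v in enumerate(self.dictionary)}
        self.main = [code_of[v] for v in rows]
        self.delta = []


col = Column(["Berlin", "Potsdam", "Berlin"])
col.insert("Walldorf")
col.insert("Potsdam")
print(col.scan_equal("Potsdam"))  # [1, 4]: one hit in main, one in delta
col.merge()                       # both inserts share a single re-encoding pass
print(col.scan_equal("Potsdam"))  # [1, 4]: same rows, now all in main
```

In this sketch every query has to consult both structures, and the delta holds values uncompressed, which mirrors the query-performance and memory cost the abstract accepts in exchange for cheaper writes. How often `merge` runs is the tuning parameter: a larger delta amortizes re-compression over more modifications but makes scans slower.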