Fast updates on read-optimized databases using multi-core CPUs

Authors:
Jens Krueger;Changkyu Kim;Martin Grund;Nadathur Satish;David Schwalb;Jatin Chhugani;Hasso Plattner;Pradeep Dubey;Alexander Zeier
Affiliations:
Hasso-Plattner-Institute, Potsdam, Germany;Parallel Computing Lab, Intel Corporation;Hasso-Plattner-Institute, Potsdam, Germany;Parallel Computing Lab, Intel Corporation;Hasso-Plattner-Institute, Potsdam, Germany;Parallel Computing Lab, Intel Corporation;Hasso-Plattner-Institute, Potsdam, Germany;Parallel Computing Lab, Intel Corporation;Hasso-Plattner-Institute, Potsdam, Germany
Venue:
Proceedings of the VLDB Endowment
Year:
2011

References: 24
Cited by: 9

Vertical partitioning algorithms for database design
ACM Transactions on Database Systems (TODS)

Data parallel algorithms
Communications of the ACM - Special issue on parallelism

A Benchmark Parallel Sort for Shared Memory Multiprocessors
IEEE Transactions on Computers

Join processing in relational databases
ACM Computing Surveys (CSUR)

“One size fits all” database architectures do not work for DSS
SIGMOD '95 Proceedings of the 1995 ACM SIGMOD international conference on Management of data

A decomposition storage model
SIGMOD '85 Proceedings of the 1985 ACM SIGMOD international conference on Management of data

Making B+-trees cache conscious in main memory
SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data

The data locality of work stealing
Proceedings of the twelfth annual ACM symposium on Parallel algorithms and architectures

Integrating vertical and horizontal partitioning into automated physical database design
SIGMOD '04 Proceedings of the 2004 ACM SIGMOD international conference on Management of data

C-store: a column-oriented DBMS
VLDB '05 Proceedings of the 31st international conference on Very large data bases

Generic database cost models for hierarchical memory systems
VLDB '02 Proceedings of the 28th international conference on Very Large Data Bases

A case for fractured mirrors
VLDB '02 Proceedings of the 28th international conference on Very Large Data Bases

Sybase IQ multiplex - designed for analytics
VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30

Adaptive aggregation on chip multiprocessors
VLDB '07 Proceedings of the 33rd international conference on Very large data bases

Efficient implementation of sorting on multi-core SIMD CPU architecture
Proceedings of the VLDB Endowment

A common database approach for OLTP and OLAP using an in-memory column database
Proceedings of the 2009 ACM SIGMOD International Conference on Management of data

Sort vs. Hash revisited: fast join implementation on modern multi-core CPUs
Proceedings of the VLDB Endowment

SIMD-scan: ultra fast in-memory table scan using on-chip vector processing units
Proceedings of the VLDB Endowment

FAST: fast architecture sensitive tree search on modern CPUs and GPUs
Proceedings of the 2010 ACM SIGMOD International Conference on Management of data

Positional update handling in column stores
Proceedings of the 2010 ACM SIGMOD International Conference on Management of data

Enterprise Application-Specific Data Management
EDOC '10 Proceedings of the 2010 14th IEEE International Enterprise Distributed Object Computing Conference

HYRISE: a main memory hybrid storage engine
Proceedings of the VLDB Endowment

Online reorganization in read optimized MMDBS
Proceedings of the 2011 ACM SIGMOD International Conference on Management of data

Optimizing write performance for read optimized databases
DASFAA'10 Proceedings of the 15th international conference on Database Systems for Advanced Applications - Volume Part II

Skew-aware automatic database partitioning in shared-nothing, parallel OLTP systems
SIGMOD '12 Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data

Compacting transactional data in hybrid OLTP&OLAP databases
Proceedings of the VLDB Endowment

Database analytics acceleration using FPGAs
Proceedings of the 21st international conference on Parallel architectures and compilation techniques

Automatic selection of processing units for coprocessing in databases
ADBIS'12 Proceedings of the 16th East European conference on Advances in Databases and Information Systems

RTP: robust tenant placement for elastic in-memory database clusters
Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data

BitWeaving: fast scans for main memory data processing
Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data

Navigating big data with high-throughput, energy-efficient data partitioning
Proceedings of the 40th Annual International Symposium on Computer Architecture

Read optimisations for append storage on flash
Proceedings of the 17th International Database Engineering & Applications Symposium

Append storage in multi-version databases on flash
BNCOD'13 Proceedings of the 29th British National conference on Big Data

Abstract

Read-optimized columnar databases handle writes through differential updates: a separate write-optimized delta partition absorbs changes and is periodically merged into the read-optimized, compressed main partition. This merge process introduces significant overhead and unacceptable downtime in update-intensive systems that aspire to combine transactional and analytical workloads in one system. In the first part of the paper, we report data analyses of 12 SAP Business Suite customer systems. In the second part, we present an optimized merge process that reduces the merge overhead of current systems by a factor of 30. Our linear-time merge algorithm exploits the high compute and bandwidth resources of modern multi-core CPUs through architecture-aware optimizations and efficient parallelization. This enables compressed in-memory column stores to sustain the transactional update rate required by enterprise applications while retaining the properties of read-optimized databases for analytic-style queries.
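
To make the delta-merge idea concrete, the following is a minimal single-threaded sketch of merging an uncompressed delta into a dictionary-encoded main column. It is an illustration only, not the paper's algorithm: the function name `merge_column`, the column layout (a sorted dictionary plus a list of value IDs), and the use of sorting and binary search for the delta are all assumptions made here for brevity, and the sketch omits the parallelization and architecture-aware optimizations the abstract refers to.

```python
from bisect import bisect_left


def merge_column(main_dict, main_ids, delta_values):
    """Merge an uncompressed delta into a dictionary-encoded main column.

    main_dict    -- sorted list of distinct values (hypothetical layout)
    main_ids     -- one index into main_dict per row of the main partition
    delta_values -- raw values appended since the last merge
    Returns the new sorted dictionary and the new value-ID vector.
    """
    delta_dict = sorted(set(delta_values))

    # Step 1: merge the two sorted dictionaries in one pass and record
    # where each old main value ID ends up in the new dictionary.
    new_dict = []
    old_to_new = [0] * len(main_dict)
    i = j = 0
    while i < len(main_dict) or j < len(delta_dict):
        take_main = j == len(delta_dict) or (
            i < len(main_dict) and main_dict[i] <= delta_dict[j]
        )
        if take_main:
            value = main_dict[i]
            old_to_new[i] = len(new_dict)
            # Skip the duplicate if the delta holds the same value.
            if j < len(delta_dict) and delta_dict[j] == value:
                j += 1
            i += 1
        else:
            value = delta_dict[j]
            j += 1
        new_dict.append(value)

    # Step 2: re-encode the main rows through the mapping, then append
    # the delta rows (binary search here is a simplification).
    new_ids = [old_to_new[v] for v in main_ids]
    new_ids.extend(bisect_left(new_dict, v) for v in delta_values)
    return new_dict, new_ids


new_dict, new_ids = merge_column(
    ["apple", "cherry"], [0, 1, 1, 0], ["banana", "cherry", "date"]
)
print(new_dict)  # ['apple', 'banana', 'cherry', 'date']
print(new_ids)   # [0, 2, 2, 0, 1, 2, 3]
```

The sketch shows why the merge is costly in practice: inserting new values into the sorted dictionary shifts existing value IDs, so every row of the main partition has to be rewritten, not only the rows that arrived in the delta.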