HYRISE: a main memory hybrid storage engine

Authors:
Martin Grund;Jens Krüger;Hasso Plattner;Alexander Zeier;Philippe Cudre-Mauroux;Samuel Madden
Affiliations:
Hasso-Plattner-Institute;Hasso-Plattner-Institute;Hasso-Plattner-Institute;Hasso-Plattner-Institute;MIT CSAIL;MIT CSAIL
Venue:
Proceedings of the VLDB Endowment
Year:
2010

Citing 20 (works this paper references; first list below)
Cited 22 (works that cite this paper; second list below)

Vertical partitioning algorithms for database design
ACM Transactions on Database Systems (TODS)

An integrated model of record segmentation and access path selection for databases
Information Systems

An Effective Approach to Vertical Partitioning for Physical Design of Relational Databases
IEEE Transactions on Software Engineering

Multilevel k-way partitioning scheme for irregular graphs
Journal of Parallel and Distributed Computing

A decomposition storage model
SIGMOD '85 Proceedings of the 1985 ACM SIGMOD international conference on Management of data

A heuristic approach to attribute partitioning
SIGMOD '79 Proceedings of the 1979 ACM SIGMOD international conference on Management of data

The Implementation of POSTGRES
IEEE Transactions on Knowledge and Data Engineering

Database Architecture Optimized for the New Bottleneck: Memory Access
VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases

DBMSs on a Modern Processor: Where Does Time Go?
VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases

Weaving Relations for Cache Performance
Proceedings of the 27th International Conference on Very Large Data Bases

Integrating vertical and horizontal partitioning into automated physical database design
SIGMOD '04 Proceedings of the 2004 ACM SIGMOD international conference on Management of data

C-store: a column-oriented DBMS
VLDB '05 Proceedings of the 31st international conference on Very large data bases

The use of cluster analysis in physical data base design
VLDB '75 Proceedings of the 1st International Conference on Very Large Data Bases

Generic database cost models for hierarchical memory systems
VLDB '02 Proceedings of the 28th international conference on Very Large Data Bases

A case for fractured mirrors
VLDB '02 Proceedings of the 28th international conference on Very Large Data Bases

Data morphing: an adaptive, cache-conscious storage technique
VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29

P*TIME: highly scalable OLTP DBMS for managing update-intensive stream workload
VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30

The end of an architectural era: (it's time for a complete rewrite)
VLDB '07 Proceedings of the 33rd international conference on Very large data bases

DSM vs. NSM: CPU performance tradeoffs in block-oriented query processing
Proceedings of the 4th international workshop on Data management on new hardware

A common database approach for OLTP and OLAP using an in-memory column database
Proceedings of the 2009 ACM SIGMOD International Conference on Management of data

The NOX OLAP query model: from algebra to execution
DaWaK'11 Proceedings of the 13th international conference on Data warehousing and knowledge discovery

VarDB: high-performance warehouse processing with massive ordering and binary search
DaWaK'11 Proceedings of the 13th international conference on Data warehousing and knowledge discovery

Trojan data layouts: right shoes for a running elephant
Proceedings of the 2nd ACM Symposium on Cloud Computing

Fast updates on read-optimized databases using multi-core CPUs
Proceedings of the VLDB Endowment

dipLODocus[RDF]: short and long-tail RDF analytics for massive webs of data
ISWC'11 Proceedings of the 10th international conference on The semantic web - Volume Part I

Aggregation strategies for columnar in-memory databases in a mixed workload
Proceedings of the 4th workshop on Workshop for Ph.D. students in information & knowledge management

Towards a hybrid row-column database for a cloud-based medical data management system
Proceedings of the 1st International Workshop on Cloud Intelligence

FDB: a query engine for factorised relational databases
Proceedings of the VLDB Endowment

Normalization in a mixed OLTP and OLAP workload scenario
TPCTC'11 Proceedings of the Third TPC Technology conference on Topics in Performance Evaluation, Measurement and Characterization

A storage advisor for hybrid-store databases
Proceedings of the VLDB Endowment

An in-depth analysis of data aggregation cost factors in a columnar in-memory database
Proceedings of the fifteenth international workshop on Data warehousing and OLAP

High-performance online spatial and temporal aggregations on multi-core CPUs and many-core GPUs
Proceedings of the fifteenth international workshop on Data warehousing and OLAP

Near real-time analytics with IBM DB2 analytics accelerator
Proceedings of the 16th International Conference on Extending Database Technology

Hekaton: SQL server's memory-optimized OLTP engine
Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data

BitWeaving: fast scans for main memory data processing
Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data

Enabling efficient OS paging for main-memory OLTP databases
Proceedings of the Ninth International Workshop on Data Management on New Hardware

Ranking and new database architectures
Proceedings of the 7th International Workshop on Ranking in Databases

Append storage in multi-version databases on flash
BNCOD'13 Proceedings of the 29th British National conference on Big Data

Design and evaluation of storage organizations for read-optimized main memory databases
Proceedings of the VLDB Endowment

A comparison of knives for bread slicing
Proceedings of the VLDB Endowment

Aggregation and ordering in factorised databases
Proceedings of the VLDB Endowment

Optimizing Sample Design for Approximate Query Processing
International Journal of Knowledge-Based Organizations

Abstract

In this paper, we describe a main memory hybrid database system called HYRISE, which automatically partitions tables into vertical partitions of varying widths depending on how the columns of the table are accessed. For columns accessed as part of analytical queries (e.g., via sequential scans), narrow partitions perform better because, when a single column is scanned, cache locality improves if the values of that column are stored contiguously. In contrast, for columns accessed as part of OLTP-style queries, wider partitions perform better because such transactions frequently insert, delete, update, or access many of the fields of a row, and co-locating those fields leads to better cache locality. Using a highly accurate model of cache misses, HYRISE predicts the performance of different partitionings and automatically selects the best one using an automated database design algorithm. We show that, on a realistic workload derived from customer applications, HYRISE can achieve a 20% to 400% performance improvement over pure all-column or all-row designs, and that it is both more scalable and produces better designs than previous vertical partitioning approaches for main memory systems.
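The trade-off between narrow and wide partitions can be illustrated with a toy cache-miss estimate. The sketch below is not HYRISE's cost model or its design algorithm. It is a deliberately simplified illustration under stated assumptions: a 64-byte cache line, fixed-width 4-byte columns, values that never straddle a cache line, no prefetching, and a cold cache for every access. All names (`scan_misses`, `row_fetch_misses`, `workload_misses`, the example layouts and workload) are hypothetical.

```python
CACHE_LINE = 64  # bytes; an assumed cache line size


def ceil_div(a, b):
    """Integer ceiling division."""
    return (a + b - 1) // b


def partition_width(partition, widths):
    """Total byte width of one vertical partition (a list of column names)."""
    return sum(widths[col] for col in partition)


def scan_misses(layout, widths, column, num_rows):
    """Estimated misses for scanning one column over all rows.

    In a narrow partition the scan streams through every cache line of the
    partition; in a wide one it touches at most one line per row (assuming
    a value never straddles two lines).
    """
    for partition in layout:
        if column in partition:
            width = partition_width(partition, widths)
            return min(ceil_div(num_rows * width, CACHE_LINE), num_rows)
    raise KeyError(column)


def row_fetch_misses(layout, widths):
    """Estimated misses for reading every field of a single row."""
    return sum(ceil_div(partition_width(p, widths), CACHE_LINE) for p in layout)


def workload_misses(layout, widths, num_rows, scans, row_fetches):
    """Total estimated misses for a mix of column scans and full-row fetches."""
    scan_cost = sum(
        count * scan_misses(layout, widths, col, num_rows)
        for col, count in scans.items()
    )
    return scan_cost + row_fetches * row_fetch_misses(layout, widths)


# Hypothetical table: 16 columns of 4 bytes each, 1000 rows.
columns = [f"c{i}" for i in range(16)]
widths = {col: 4 for col in columns}
layouts = {
    "all-row": [columns],
    "all-column": [[col] for col in columns],
    "hybrid": [["c0"], columns[1:]],  # scanned column kept in its own partition
}

# Hypothetical mixed workload: 100 scans of c0 plus 1000 full-row fetches.
costs = {
    name: workload_misses(layout, widths, 1000, {"c0": 100}, 1000)
    for name, layout in layouts.items()
}
best = min(costs, key=costs.get)
print(costs, best)
```

Under these assumptions the hybrid layout has the lowest estimate, because the scanned column is stored contiguously while the remaining fields of a row stay co-located. Picking the minimum over three hand-written candidates stands in for the automated search only in the loosest sense; the abstract does not describe the actual algorithm.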