Aggregation strategies for columnar in-memory databases in a mixed workload

  • Authors:
  • Stephan Müller;Hasso Plattner

  • Affiliations:
  • Hasso-Plattner-Institut, Potsdam, Germany;Hasso-Plattner-Institut, Potsdam, Germany

  • Venue:
  • Proceedings of the 4th workshop on Workshop for Ph.D. students in information & knowledge management
  • Year:
  • 2011

Abstract

The recent trend towards analytics on operational data has led to an approach of reunifying online transactional processing and online analytical processing in a single database. The advent of columnar in-memory databases makes this feasible, as expensive join and aggregation operations can be performed faster than in traditional row-oriented databases. This has led to the radical proposal of abandoning materialized aggregate tables and calculating all aggregations on the fly. This PhD research project investigates the factors that influence aggregation performance in columnar in-memory databases. Based on the identified factors, we aim to evaluate different cost model approaches, which are to be validated with real-life data from large industry customers and their mixed workloads. The goal of this project is the design and implementation of an aggregation engine that decides whether or not to materialize an aggregate. The decision weighs the benefit to query performance against aggregation view maintenance costs, based on the data and application characteristics, the historic and current workload, and other cost-relevant factors.