An in-depth analysis of data aggregation cost factors in a columnar in-memory database

  • Authors:
  • Stephan Müller; Hasso Plattner

  • Affiliations:
  • Hasso-Plattner-Institut, University of Potsdam, Potsdam, Germany; Hasso-Plattner-Institut, University of Potsdam, Potsdam, Germany

  • Venue:
  • Proceedings of the Fifteenth International Workshop on Data Warehousing and OLAP
  • Year:
  • 2012

Abstract

Precise prediction of query execution performance is the basis for various database optimization strategies. With columnar in-memory databases, cost modeling changes in two dimensions. First, models for disk-based databases are not well suited, as the new bottleneck is main memory access. Second, the possibility of executing mixed workloads creates new challenges. For transactional and analytical queries with aggregation operations, memory access patterns, and thus execution times, vary significantly. This paper discusses the influence of data characteristics on aggregation operations and highlights factors not considered by existing cost model approaches. Further, we present benchmarks implemented and executed on a columnar in-memory research database to support our assumptions.