CubiST++: Evaluating Ad-Hoc CUBE Queries Using Statistics Trees

  • Authors:
  • Joachim Hammer;Lixin Fu

  • Affiliations:
  • Computer & Information Science & Eng., University of Florida, Gainesville, FL 32611-6120, USA. jhammer@cise.ufl.edu;Division of Computer Science, University of North Carolina, Greensboro, Greensboro, NC 27402-6170, USA. lfu@uncg.edu

  • Venue:
  • Distributed and Parallel Databases
  • Year:
  • 2003

Quantified Score

Hi-index 0.00

Visualization

Abstract

We report on a new, efficient encoding for the data cube, which results in a drastic speed-up of OLAP queries that aggregate along any combination of dimensions over numerical and categorical attributes. We are focusing on a class of queries called cube queries, which return aggregated values rather than sets of tuples. Our approach, termed CubiST++ (Cubing with Statistics Trees Plus Families), represents a drastic departure from existing relational (ROLAP) and multi-dimensional (MOLAP) approaches in that it does not use the view lattice to compute and materialize new views from existing views in some heuristic fashion. Instead, CubiST++ encodes all possible aggregate views in the leaves of a new data structure called statistics tree (ST) during a one-time scan of the detailed data. In order to optimize the queries involving constraints on hierarchy levels of the underlying dimensions, we select and materialize a family of candidate trees, which represent superviews over the different hierarchical levels of the dimensions. Given a query, our query evaluation algorithm selects the smallest tree in the family, which can provide the answer. Extensive evaluations of our prototype implementation have demonstrated its superior run-time performance and scalability when compared with existing MOLAP and ROLAP systems.