Pruning attribute values from data cubes with diamond dicing

  • Authors:
  • Hazel Webb; Owen Kaser; Daniel Lemire

  • Affiliations:
  • University of New Brunswick; University of New Brunswick; Université du Québec à Montréal

  • Venue:
  • IDEAS '08 Proceedings of the 2008 international symposium on Database engineering & applications
  • Year:
  • 2008

Abstract

Data stored in a data warehouse are inherently multidimensional, but most data-pruning techniques (such as iceberg and top-k queries) are unidimensional. However, analysts need to issue multidimensional queries. For example, an analyst may need to select not just the most profitable stores or, separately, the most profitable products, but simultaneous sets of stores and products fulfilling some profitability constraints. To fill this need, we propose a new operator, the diamond dice. Because of the interaction between dimensions, the computation of diamonds is challenging. We present the first diamond-dicing experiments on large data sets. Our external-memory algorithm avoids potentially expensive random accesses. Experiments show that we can compute diamond cubes over fact tables containing 100 million facts and 500,000 distinct attribute values in less than an hour using a single-core PC.
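The stores-and-products example can be illustrated with a small in-memory sketch. It is not the external-memory algorithm the abstract refers to, and the abstract does not spell out the operator's formal definition; the sketch assumes one plausible reading, in which a store or product survives only if its total profit over the surviving facts meets a threshold. The function name `diamond_dice`, the thresholds, and the sample data are all hypothetical, and the profits are assumed non-negative so that repeated pruning settles on a single answer.

```python
from collections import defaultdict


def diamond_dice(facts, min_store_total, min_product_total):
    """Repeatedly prune stores and products whose totals fall below a threshold.

    facts: iterable of (store, product, profit) tuples, profit assumed >= 0.
    Returns the facts that survive once no further pruning is possible.
    Illustrative in-memory sketch only; thresholds are hypothetical parameters.
    """
    remaining = list(facts)
    while True:
        # Recompute per-store and per-product totals over the surviving facts.
        store_total = defaultdict(float)
        product_total = defaultdict(float)
        for store, product, profit in remaining:
            store_total[store] += profit
            product_total[product] += profit

        # Keep a fact only if both its store and its product meet their thresholds.
        kept = [
            (store, product, profit)
            for store, product, profit in remaining
            if store_total[store] >= min_store_total
            and product_total[product] >= min_product_total
        ]

        # Fixed point: nothing was removed in this pass.
        if len(kept) == len(remaining):
            return kept
        remaining = kept


# Made-up sample data: (store, product, profit).
sample = [
    ("A", "x", 5), ("A", "y", 5),
    ("B", "x", 4), ("B", "y", 6),
    ("C", "x", 2), ("C", "z", 7),
]
result = diamond_dice(sample, min_store_total=8, min_product_total=8)
print(sorted(result))
```

In this sample, store C initially passes with a total of 9, but product z (total 7) is pruned first, which drops C to 2 and removes it on the next pass. That cascade is the interaction between dimensions that a single one-dimensional filter would miss.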