Requirement-based data cube schema design

Authors:
David W. Cheung; Bo Zhou; Ben Kao; Hongjun Lu; Tak Wah Lam; Hing Fung Ting
Affiliations:
Department of Computer Science and Information Systems, The University of Hong Kong, Hong Kong; Department of Computer Science and Engineering, Zhejiang University, Hangzhou, China; Department of Computer Science and Information Systems, The University of Hong Kong, Hong Kong; Department of Computer Science, The Hong Kong University of Science and Technology, Hong Kong; Department of Computer Science and Information Systems, The University of Hong Kong, Hong Kong; Department of Computer Science and Information Systems, The University of Hong Kong, Hong Kong
Venue:
Proceedings of the eighth international conference on Information and knowledge management
Year:
1999

Abstract

On-line analytical processing (OLAP) requires efficient processing of complex decision support queries over very large databases. It is well accepted that pre-computed data cubes can help reduce the response time of such queries dramatically. A very important design issue of an efficient OLAP system is therefore the choice of the right data cubes to materialize. We call this problem the data cube schema design problem. In this paper we show that the problem of finding an optimal data cube schema for an OLAP system with limited memory is NP-hard. As a more computationally efficient alternative, we propose a greedy approximation algorithm cMP and its variants. Algorithm cMP consists of two phases. In the first phase, an initial schema consisting of all the cubes required to efficiently answer the user queries is formed. In the second phase, cubes in the initial schema are selectively merged to satisfy the memory constraint. We show that cMP is very effective in pruning the search space for an optimal schema. This leads to a highly efficient algorithm. We report the efficiency and the effectiveness of cMP via an empirical study using the TPC-D benchmark. Our results show that the data cube schemas generated by cMP enable very efficient OLAP query processing.
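The two-phase idea described in the abstract can be illustrated with a small sketch. This is not the authors' cMP algorithm: the size model (product of dimension cardinalities, capped at the number of fact-table rows), the query cost model (size of the smallest cube that covers the query), and the merge criterion (least added query cost per unit of memory saved) are all assumptions made for illustration, as are the names `cube_size`, `schema_size`, `query_cost`, and `greedy_merge`.

```python
from itertools import combinations
from math import prod


def cube_size(cube, card, fact_rows):
    # Assumed size model: product of dimension cardinalities,
    # capped at the number of rows in the fact table.
    return min(prod(card[d] for d in cube), fact_rows)


def schema_size(schema, card, fact_rows):
    # Total memory needed to materialize every cube in the schema.
    return sum(cube_size(c, card, fact_rows) for c in schema)


def query_cost(schema, queries, card, fact_rows):
    # Assumed cost model: each query is answered from the smallest
    # cube whose dimensions cover the query's dimensions.
    return sum(
        min(cube_size(c, card, fact_rows) for c in schema if q <= c)
        for q in queries
    )


def greedy_merge(queries, card, fact_rows, budget):
    # Phase 1: one cube per distinct query, so every query is
    # answered by a cube that matches it exactly.
    schema = set(queries)

    # Phase 2: merge pairs of cubes until the schema fits the budget.
    while schema_size(schema, card, fact_rows) > budget and len(schema) > 1:
        base_size = schema_size(schema, card, fact_rows)
        base_cost = query_cost(schema, queries, card, fact_rows)
        best = None
        for a, b in combinations(schema, 2):
            merged = (schema - {a, b}) | {a | b}
            saved = base_size - schema_size(merged, card, fact_rows)
            if saved <= 0:
                continue  # this merge frees no memory
            added = query_cost(merged, queries, card, fact_rows) - base_cost
            penalty = added / saved
            if best is None or penalty < best[0]:
                best = (penalty, merged)
        if best is None:
            break  # no merge saves memory; budget cannot be met this way
        schema = best[1]
    return schema


# Hypothetical toy input: three dimensions, three queries.
card = {"part": 10, "supplier": 10, "date": 10}
queries = [
    frozenset({"part"}),
    frozenset({"part", "supplier"}),
    frozenset({"supplier", "date"}),
]
schema = greedy_merge(queries, card, fact_rows=50, budget=100)
print(schema_size(schema, card, fact_rows=50))  # 60
```

In this toy input the initial schema needs 110 units against a budget of 100. Merging the two two-dimensional cubes into one three-dimensional cube frees 50 units without raising the cost of any query under the assumed model, so the sketch picks that merge and stops with two cubes.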