Approximate Query Processing in Cube Streams

  • Authors:
  • Ming-Jyh Hsieh;Ming-Syan Chen;Philip S. Yu

  • Affiliations:
  • -;IEEE;IEEE

  • Venue:
  • IEEE Transactions on Knowledge and Data Engineering
  • Year:
  • 2007

Abstract

Data cubes have become important components of most data warehouse and decision support systems. In such systems, users usually pose very complex queries to the Online Analytical Processing (OLAP) system, and the systems usually have to deal with huge amounts of data because of the high dimensionality of the data sets; thus, approximate query processing has emerged as a viable solution. Specifically, in contrast to traditional cube approximation, applications of cube streams handle multidimensional data sets in a continuous manner. Such an application collects data events for cube streams online, generates snapshots with limited resources, and keeps the approximated information in a synopsis memory for further analysis. Compared to OLAP applications, applications of cube streams are subject to many more resource constraints on both processing time and memory, and, because of these limited resources, cannot be handled by existing methods. In this paper, we propose the DAWA algorithm, a hybrid of Dct for Data and the discrete WAvelet transform (DWT), to approximate cube streams. Our algorithm combines the advantages of the high compression rate of DWT and the low memory cost of DCT. Consequently, DAWA requires a much smaller working buffer and outperforms both DWT-based and DCT-based methods in execution efficiency. It is also shown that DAWA provides a good solution for approximate query processing of cube streams with a small working buffer and a short execution time. The optimality of the DAWA algorithm is theoretically proved and empirically demonstrated by our experiments.
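
The general idea of a transform-based synopsis that the abstract refers to can be illustrated with a small sketch. The code below is not the DAWA algorithm, whose details the abstract does not give. It is a plain one-dimensional Haar wavelet transform that keeps only the k largest coefficients as a synopsis and answers a range-sum query from it. The function names (haar_forward, haar_inverse, keep_top_k, approx_range_sum) are hypothetical and chosen only for this illustration.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_forward(values):
    """Orthonormal Haar wavelet transform of a list whose length is a power of two."""
    out = [float(v) for v in values]
    n = len(out)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    length = n
    while length > 1:
        half = length // 2
        tmp = out[:length]  # snapshot of the current level before overwriting
        for i in range(half):
            a, b = tmp[2 * i], tmp[2 * i + 1]
            out[i] = (a + b) / SQRT2          # scaled pairwise sum
            out[half + i] = (a - b) / SQRT2   # scaled pairwise difference
        length = half
    return out


def haar_inverse(coeffs):
    """Inverse of haar_forward."""
    out = list(coeffs)
    n = len(out)
    length = 2
    while length <= n:
        half = length // 2
        tmp = out[:length]
        for i in range(half):
            s, d = tmp[i], tmp[half + i]
            out[2 * i] = (s + d) / SQRT2
            out[2 * i + 1] = (s - d) / SQRT2
        length *= 2
    return out


def keep_top_k(coeffs, k):
    """Zero out all but the k coefficients of largest magnitude (the synopsis)."""
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
    keep = set(order[:k])
    return [c if i in keep else 0.0 for i, c in enumerate(coeffs)]


def approx_range_sum(synopsis, lo, hi):
    """Approximate sum of the original values at positions lo..hi-1."""
    return sum(haar_inverse(synopsis)[lo:hi])


# Illustrative data only: one row of a hypothetical cube snapshot.
snapshot = [4, 4, 4, 4, 10, 10, 10, 10]
synopsis = keep_top_k(haar_forward(snapshot), 2)
estimate = approx_range_sum(synopsis, 2, 6)
```

In this sketch the synopsis size k trades memory for accuracy: the smaller k is, the less is stored and the coarser the answers become. A hybrid such as the one described in the abstract would have to manage the same trade-off under the tighter time and buffer limits of a stream setting.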