Dynamic Materialization of Query Views for Data Warehouse Workloads

Authors:
Thomas Phan;Wen-Syan Li
Affiliations:
Yahoo!, Inc., IBM Research. thomas.phan@acm.org;IBM Almaden Research Center. wsl@us.ibm.com
Venue:
ICDE '08 Proceedings of the 2008 IEEE 24th International Conference on Data Engineering
Year:
2008

Citing 0
Cited 6

A request-routing framework for SOA-based enterprise computing

Proceedings of the VLDB Endowment
An architecture for recycling intermediates in a column-store

Proceedings of the 2009 ACM SIGMOD International Conference on Management of data
High Performance Analytics with the R3-Cache

DaWaK '09 Proceedings of the 11th International Conference on Data Warehousing and Knowledge Discovery
An architecture for recycling intermediates in a column-store

ACM Transactions on Database Systems (TODS)
Queen-bee: query interaction-aware for buffer allocation and scheduling problem

DaWaK'12 Proceedings of the 14th international conference on Data Warehousing and Knowledge Discovery
Dynamic View Management System for Query Prediction to View Materialization

International Journal of Data Warehousing and Mining

Quantified Score

Hi-index	0.00

Visualization

Abstract

A materialized view, or Materialized Query Table (MQT), is an auxiliary table with precomputed data that can be used to significantly improve the performance of a database query. Previous research efforts have focused onfinding the best candidate MQT set, with a common static heuristic being to greedily pre-materialize the MQTs prior to executing the workload. While this approach is sound when the size of the MQT set on disk is small, it will not be able to pre-materialize all MQTs and indexes when faced with real-world disk limits and view maintenance costs, and thus a static heuristic will fail to exploit the potentially large benefits of those MQTs not selected for materialization. In this paper we present an automated, dynamic MQT management scheme that materializes views and creates indexes in an on-demand fashion as a workload executes and manages them with an LRU cache. In order to maximize the benefit of executing queries with MQTs, the scheme makes an adaptive tradeoff between the MQT materializations, the base table accesses, and the benefit of MQT hits in the cache. To find the workload permutation that produces the overall highest net benefit, we use a genetic algorithm to search the N! solution space, and to avoid materializing seldom-used MQTs, we prune the set of MQT candidates.