Dynamic workload management for very large data warehouses: juggling feathers and bowling balls

  • Authors:
  • Stefan Krompass; Harumi Kuno; Umeshwar Dayal; Alfons Kemper

  • Affiliations:
  • TU München, Garching, Germany; HP Labs, Palo Alto, CA; HP Labs, Palo Alto, CA; TU München, Garching, Germany

  • Venue:
  • VLDB '07: Proceedings of the 33rd International Conference on Very Large Data Bases
  • Year:
  • 2007

Abstract

Workload management for business intelligence (BI) queries poses different challenges than those addressed in the online transaction processing (OLTP) context. The fundamental problem is that the execution times of BI queries can range from milliseconds to hours, and it is difficult to estimate these times accurately. The key challenges raised by this problem are how to identify queries that are not performing properly and what to do about them. We propose a workload management system that controls the execution of individual queries based on realistic customer service level objectives. To validate our proposal, we have implemented an experimental system that includes a dynamic execution controller that leverages fuzzy logic. We present results from a number of experiments that we ran using workloads based on actual industrial workloads and on customer objectives that we gathered by interviewing industry practitioners. Our experiments show that even a handful of moderately misbehaving problem queries can have a significant impact on a workload consisting of thousands of queries. We were surprised when our experiments also demonstrated that false positives (incorrectly identifying a normal query as a problem) can have significant consequences as well. For these reasons, it is very important that an execution controller be as accurate as possible, avoiding both false positives and false negatives. Our experiments also validate that our execution controller can markedly improve the execution of a workload that includes problem queries.