QoX-driven ETL design: reducing the cost of ETL consulting engagements

  • Authors:
  • Alkis Simitsis; Kevin Wilkinson; Malu Castellanos; Umeshwar Dayal

  • Affiliations:
  • HP Labs, Palo Alto, CA, USA; HP Labs, Palo Alto, CA, USA; HP Labs, Palo Alto, CA, USA; HP Labs, Palo Alto, CA, USA

  • Venue:
  • Proceedings of the 2009 ACM SIGMOD International Conference on Management of data
  • Year:
  • 2009

Abstract

As business intelligence becomes increasingly essential for organizations and as it evolves from strategic to operational, the complexity of Extract-Transform-Load (ETL) processes grows. As a consequence, ETL engagements have become very time-consuming, labor-intensive, and costly. At the same time, additional requirements besides functionality and performance need to be considered in the design of ETL processes. In particular, the design quality needs to be determined by an intricate combination of different metrics like reliability, maintenance, scalability, and others. Unfortunately, there are no methodologies, modeling languages, or tools to support ETL design in a systematic, formal way for achieving these quality requirements. Current practice handles them with ad hoc approaches based only on designers' experience. This results in either poor designs that do not meet the quality objectives or costly engagements that require several iterations to meet them. A fundamental shift that uses automation in the ETL design task is the only way to reduce the cost of these engagements while obtaining optimal designs. Towards this goal, we present a novel approach to ETL design that incorporates a suite of quality metrics, termed QoX, at all stages of the design process. We discuss the challenges and tradeoffs among QoX metrics and illustrate their impact on alternative designs.