GEM: requirement-driven generation of ETL and multidimensional conceptual designs

  • Authors:
  • Oscar Romero; Alkis Simitsis; Alberto Abelló

  • Affiliations:
  • Universitat Politècnica de Catalunya, BarcelonaTech, Barcelona, Spain; HP Labs, Palo Alto, CA; Universitat Politècnica de Catalunya, BarcelonaTech, Barcelona, Spain

  • Venue:
  • DaWaK'11 Proceedings of the 13th international conference on Data warehousing and knowledge discovery
  • Year:
  • 2011

Abstract

At the early stages of a data warehouse design project, the main objective is to collect the business requirements and needs and translate them into an appropriate conceptual multidimensional design. Typically, this task is performed manually, through a series of interviews involving two parties: business analysts and technical designers. Producing an appropriate conceptual design is error-prone and undergoes several rounds of reconciliation and redesign until the business needs are satisfied. It is therefore of great importance to an enterprise to facilitate and automate this process. The goal of our research is to provide designers with a semi-automatic means of producing conceptual multidimensional designs, as well as conceptual representations of the extract-transform-load (ETL) processes that orchestrate the data flow from the operational sources to the data warehouse constructs. In particular, we describe a method that combines information about the data sources with the business requirements in order to validate and, if necessary, complete these requirements, produce a multidimensional design, and identify the ETL operations needed. We present our method using the TPC-DS benchmark and show its applicability and usefulness.
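
To make the pipeline described in the abstract more concrete, the following is a minimal toy sketch in Python. It is not the GEM algorithm from the paper. It only illustrates the general idea of checking a requirement against source metadata, completing it with join paths, and emitting a multidimensional design together with a list of conceptual ETL operations. The `Requirement` class, the `validate` and `design` functions, the operation names (`EXTRACT`, `JOIN`, `AGGREGATE`), and the three-table schema fragment (loosely modelled on TPC-DS) are all assumptions introduced for illustration.

```python
from dataclasses import dataclass

# Assumed source metadata: table -> columns (a tiny fragment loosely modelled on TPC-DS).
SOURCES = {
    "store_sales": {"ss_item_sk", "ss_store_sk", "ss_net_paid"},
    "item": {"i_item_sk", "i_category"},
    "store": {"s_store_sk", "s_city"},
}

# Assumed foreign keys: (table, column) -> referenced table.
FOREIGN_KEYS = {
    ("store_sales", "ss_item_sk"): "item",
    ("store_sales", "ss_store_sk"): "store",
}


@dataclass
class Requirement:
    """Hypothetical business requirement: what to measure and how to group it."""
    measures: list  # list of (table, column) pairs
    group_by: list  # list of (table, column) pairs


def validate(req):
    """Return the requested attributes that no source table can supply."""
    requested = req.measures + req.group_by
    return [(t, c) for t, c in requested if c not in SOURCES.get(t, set())]


def design(req):
    """Derive a toy multidimensional design and the ETL operations feeding it."""
    missing = validate(req)
    if missing:
        raise ValueError(f"requirement not supported by the sources: {missing}")

    facts = sorted({t for t, _ in req.measures})
    dimensions = sorted({t for t, _ in req.group_by} - set(facts))

    etl = [("EXTRACT", table) for table in facts + dimensions]
    for fact in facts:
        for dim in dimensions:
            # Complete the requirement with the join path it left implicit.
            keys = [c for (t, c), ref in FOREIGN_KEYS.items() if t == fact and ref == dim]
            if not keys:
                raise ValueError(f"no join path from {fact} to {dim}")
            etl.append(("JOIN", fact, dim, keys[0]))
    etl.append((
        "AGGREGATE",
        [c for _, c in req.measures],
        [c for _, c in req.group_by],
    ))
    return {"facts": facts, "dimensions": dimensions, "etl": etl}


# Example requirement: net paid amount per item category and store city.
example = Requirement(
    measures=[("store_sales", "ss_net_paid")],
    group_by=[("item", "i_category"), ("store", "s_city")],
)
result = design(example)
```

In this sketch, tables that supply measures become facts and tables that supply grouping attributes become dimensions. The foreign-key lookup stands in for the "completing" step by adding joins the requirement did not state explicitly.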