A model-driven framework for ETL process development

  • Authors:
  • Zineb El Akkaoui;Esteban Zimányi;Jose-Norberto Mazón;Juan Trujillo

  • Affiliations:
  • Université Libre de Bruxelles, Brussels, Belgium;Université Libre de Bruxelles, Brussels, Belgium;University of Alicante, Alicante, Spain;University of Alicante, Alicante, Spain

  • Venue:
  • Proceedings of the ACM 14th international workshop on Data Warehousing and OLAP
  • Year:
  • 2011

Abstract

ETL processes are the backbone of a data warehouse, since they supply it with the necessary integrated and reconciled data from heterogeneous and distributed data sources. However, ETL process development, and particularly its design phase, is still perceived as a time-consuming task. This is mainly because ETL processes are typically designed with a specific technology in mind from the very beginning of the development process. Thus, it is difficult to share and reuse methodologies and best practices among projects implemented with different technologies. To the best of our knowledge, no attempt has yet been made to harmonize ETL process development by proposing a common and integrated development strategy. To overcome this drawback, this paper introduces a framework for model-driven development of ETL processes. The benefit of our framework is twofold: (i) it uses vendor-independent models for a unified design of ETL processes, based on the expressive and well-known standard for modeling business processes, the Business Process Modeling Notation (BPMN), and (ii) it automatically transforms these models into the vendor-specific code required to execute the ETL process on a concrete platform.
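
The two-step idea in the abstract (design once in a vendor-independent model, then generate platform-specific code) can be illustrated with a deliberately small sketch. It is not the authors' framework and does not use BPMN: the `Task` class, the `GENERATORS` registry, and the line-oriented "script" target format are all hypothetical names invented for this illustration, and the SQL target assumes the source and target tables have compatible columns.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Task:
    """One step of a platform-independent ETL flow (hypothetical toy model)."""
    kind: str      # "extract", "filter", or "load"
    argument: str  # table name, or a filter condition for "filter" tasks


def to_sql(flow: List[Task]) -> str:
    """Render the flow as one INSERT ... SELECT statement."""
    source = next(t.argument for t in flow if t.kind == "extract")
    target = next(t.argument for t in flow if t.kind == "load")
    conditions = [t.argument for t in flow if t.kind == "filter"]
    where = f" WHERE {' AND '.join(conditions)}" if conditions else ""
    return f"INSERT INTO {target} SELECT * FROM {source}{where};"


def to_script(flow: List[Task]) -> str:
    """Render the flow in a made-up line-oriented script format."""
    keywords = {"extract": "READ", "filter": "KEEP WHERE", "load": "WRITE"}
    return "\n".join(f"{keywords[t.kind]} {t.argument}" for t in flow)


# Registry of model-to-text generators, one per (hypothetical) target platform.
GENERATORS: Dict[str, Callable[[List[Task]], str]] = {
    "sql": to_sql,
    "script": to_script,
}


def generate(flow: List[Task], platform: str) -> str:
    """Transform the same vendor-independent flow into code for one platform."""
    if platform not in GENERATORS:
        raise ValueError(f"no generator registered for platform {platform!r}")
    return GENERATORS[platform](flow)


# A single design, reused unchanged for both targets.
example_flow = [
    Task("extract", "customers"),
    Task("filter", "country = 'BE'"),
    Task("load", "dw_customers"),
]

if __name__ == "__main__":
    for platform in GENERATORS:
        print(f"-- {platform}")
        print(generate(example_flow, platform))
```

The point of the sketch is only the separation of concerns: `example_flow` carries no platform detail, so supporting another target means registering one more generator rather than redesigning the process.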