E-ETL: framework for managing evolving ETL processes

  • Authors:
  • Artur Wojciechowski

  • Affiliations:
  • Poznań University of Technology, Poznań, Poland

  • Venue:
  • Proceedings of the 4th Workshop for Ph.D. Students in Information & Knowledge Management
  • Year:
  • 2011

Abstract

External data sources (EDSs) that are integrated in a data warehouse (DW) frequently change their data structures (schemas). As a consequence, an already deployed ETL workflow often executes with errors. Since structural changes of EDSs are frequent, automatic repair of an ETL workflow after such changes is of high importance. In this paper we present a framework for handling the evolution of an ETL layer. To this end, structural changes are monitored and stored in a Metabase. An erroneous execution of an ETL workflow triggers the repair of the ETL activities that interact with the changed EDS, so that the repaired activities can work on the changed EDS schema. The repair of the ETL activities is guided by several customizable repair algorithms. The proposed framework was developed as a module external to an ETL engine and accesses the engine through its API. The main innovation of this framework is its set of algorithms for semi-automatic repair of an ETL workflow.
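The monitor, store, and repair cycle outlined in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm or the E-ETL API: the names `SchemaChange`, `Activity`, `diff_schemas`, and `repair`, the restriction to column renames and drops, and the `renames` hint are all assumptions made for illustration. The list returned by `diff_schemas` stands in for the records a Metabase might hold, and the notes returned by `repair` stand in for the "semi-automatic" part, where a person reviews each proposed fix.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class SchemaChange:
    """One recorded structural change of an EDS table (hypothetical record)."""
    kind: str                      # "rename" or "drop"
    table: str
    column: str
    new_name: Optional[str] = None


@dataclass
class Activity:
    """An ETL activity that reads a fixed list of columns from one table."""
    name: str
    table: str
    columns: List[str]


def diff_schemas(table: str, old_cols: List[str], new_cols: List[str],
                 renames: Optional[Dict[str, str]] = None) -> List[SchemaChange]:
    """Compare two schema snapshots and list the detected changes.

    `renames` is an assumed hint (old name -> new name); without it a
    missing column is treated as dropped.
    """
    renames = renames or {}
    changes = []
    for col in old_cols:
        if col in new_cols:
            continue  # column is unchanged
        target = renames.get(col)
        if target in new_cols:
            changes.append(SchemaChange("rename", table, col, target))
        else:
            changes.append(SchemaChange("drop", table, col))
    return changes


def repair(activity: Activity,
           changes: List[SchemaChange]) -> Tuple[Activity, List[str]]:
    """Return a repaired copy of the activity plus notes for human review."""
    cols = list(activity.columns)
    notes = []
    for ch in changes:
        if ch.table != activity.table or ch.column not in cols:
            continue  # this change does not affect the activity
        if ch.kind == "rename":
            cols[cols.index(ch.column)] = ch.new_name
            notes.append(f"{activity.name}: {ch.column} -> {ch.new_name}")
        elif ch.kind == "drop":
            cols.remove(ch.column)
            notes.append(f"{activity.name}: removed {ch.column}")
    return Activity(activity.name, activity.table, cols), notes
```

In this sketch, a failed workflow run would call `diff_schemas` on the last known and current schema of the affected table, then pass the result to `repair` for each activity that reads from it. A real repair algorithm would also need to handle type changes, added columns, and dependencies between activities, which are left out here.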