A generic and customizable framework for the design of ETL scenarios

  • Authors:
  • Panos Vassiliadis; Alkis Simitsis; Panos Georgantas; Manolis Terrovitis; Spiros Skiadopoulos

  • Affiliations:
  • Department of Computer Science, University of Ioannina, Ioannina, Greece; Department of Electrical and Computer Engineering, National Technical University of Athens, Athens, Greece; Department of Electrical and Computer Engineering, National Technical University of Athens, Athens, Greece; Department of Electrical and Computer Engineering, National Technical University of Athens, Athens, Greece; Department of Electrical and Computer Engineering, National Technical University of Athens, Athens, Greece

  • Venue:
  • Information Systems - Special issue: The 15th international conference on advanced information systems engineering (CAiSE 2003)
  • Year:
  • 2005

Abstract

Extraction-transformation-loading (ETL) tools are pieces of software responsible for the extraction of data from several sources, their cleansing, customization and insertion into a data warehouse (DW). In this paper, we delve into the logical design of ETL scenarios and provide a generic and customizable framework to support the DW designer in this task. First, we present a metamodel particularly customized for the definition of ETL activities. We follow a workflow-like approach, where the output of a certain activity can either be stored persistently or passed to a subsequent activity. Also, we employ a declarative database programming language, LDL, to define the semantics of each activity. The metamodel is generic enough to capture any possible ETL activity. Nevertheless, in the pursuit of higher reusability and flexibility, we specialize the set of our generic metamodel constructs with a palette of frequently used ETL activities, which we call templates. Moreover, in order to achieve a uniform extensibility mechanism for this library of built-ins, we have to deal with specific language issues. Therefore, we also discuss the mechanics of template instantiation to concrete activities. The design concepts that we introduce have been implemented in a tool, ARKTOS II, which is also presented.
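
To make the ideas of templates, instantiation, and workflow-like chaining more concrete, the following is a minimal Python sketch. It is not the paper's LDL notation and not the ARKTOS II API; every name in it (`Template`, `Activity`, `NOT_NULL`, `SCALE`, `run_scenario`) is hypothetical and chosen only for illustration. It assumes that a template is a parameterized activity definition, that instantiation binds its parameters to produce a concrete activity, and that a scenario feeds each activity's output to the next before storing the final result.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List

Row = Dict[str, object]
RowFn = Callable[[Iterable[Row]], List[Row]]


@dataclass(frozen=True)
class Activity:
    """A concrete activity: a label plus a function from input rows to output rows."""
    label: str
    run: RowFn


@dataclass(frozen=True)
class Template:
    """A reusable, parameterized activity definition (hypothetical, for illustration)."""
    name: str
    build: Callable[[Dict[str, object]], RowFn]  # binds parameters, returns the row function

    def instantiate(self, **params: object) -> Activity:
        # Instantiation binds the template's parameters to concrete values.
        return Activity(f"{self.name}({params})", self.build(params))


# Two illustrative templates: reject rows with a null field, and rescale a numeric field.
NOT_NULL = Template(
    "not_null",
    lambda p: lambda rows: [r for r in rows if r.get(p["field"]) is not None],
)
SCALE = Template(
    "scale",
    lambda p: lambda rows: [{**r, p["field"]: r[p["field"]] * p["factor"]} for r in rows],
)


def run_scenario(source: Iterable[Row], activities: List[Activity], store: List[Row]) -> List[Row]:
    """Pass each activity's output to the next one, then store the final rows."""
    rows = list(source)
    for activity in activities:
        rows = activity.run(rows)
    store.extend(rows)  # the list stands in for a persistent warehouse table
    return store


source = [{"id": 1, "cost": 2.0}, {"id": 2, "cost": None}]
scenario = [
    NOT_NULL.instantiate(field="cost"),
    SCALE.instantiate(field="cost", factor=100),
]
warehouse = run_scenario(source, scenario, [])
print(warehouse)  # [{'id': 1, 'cost': 200.0}]
```

In this sketch the semantics of each activity are given by ordinary Python functions; in the framework described above, that role is played by declarative LDL definitions.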