Optimized incremental ETL jobs for maintaining data warehouses

  • Authors: Andreas Behrend; Thomas Jörg
  • Affiliations: University of Bonn, Germany; University of Kaiserslautern, Germany
  • Venue: Proceedings of the Fourteenth International Database Engineering & Applications Symposium
  • Year: 2010

Abstract

ETL jobs are used to integrate data from distributed and heterogeneous sources into a data warehouse. A well-known challenge in this context is the development of incremental ETL jobs that efficiently maintain warehouse data when the source data is updated. In this paper, we present a new transformation-based approach for automatically deriving incremental ETL jobs. To this end, we simplify the underlying update propagation process by computing so-called safe updates instead of true ones. Additionally, we identify limitations of previously proposed incremental solutions and show that they can be overcome by employing Magic Sets, which leads to dramatic performance gains.
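
The general idea of incremental maintenance mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's derivation method and does not model safe updates or Magic Sets; it only contrasts recomputing a joined warehouse view from scratch with propagating a batch of inserted source rows. The table layout and the names `full_join` and `incremental_insert` are hypothetical and chosen for illustration.

```python
def full_join(orders, customers):
    """Recompute the whole view: each order paired with its customer's region."""
    return {(oid, cid, customers[cid]) for (oid, cid) in orders if cid in customers}


def incremental_insert(view, new_orders, customers):
    """Propagate only the inserted orders into the existing view.

    Assumes insertions into `orders` only and an unchanged `customers` table.
    """
    delta = {(oid, cid, customers[cid]) for (oid, cid) in new_orders if cid in customers}
    return view | delta, delta


# Hypothetical source data: customer id -> region, and (order id, customer id) pairs.
customers = {1: "EU", 2: "US"}
orders = {(100, 1), (101, 2)}
view = full_join(orders, customers)

# Customer 9 is unknown, so order 103 finds no join partner.
new_orders = {(102, 1), (103, 9)}
view_after, delta = incremental_insert(view, new_orders, customers)
```

In this sketch the incremental step touches only the inserted rows, yet it yields the same view as a full recomputation over the updated source.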