Benchmarking ETL Workflows

  • Authors:
  • Alkis Simitsis;Panos Vassiliadis;Umeshwar Dayal;Anastasios Karagiannis;Vasiliki Tziovara

  • Affiliations:
  • HP Labs, Palo Alto, USA;Dept. of Computer Science, University of Ioannina, Ioannina;HP Labs, Palo Alto, USA;Dept. of Computer Science, University of Ioannina, Ioannina;Dept. of Computer Science, University of Ioannina, Ioannina

  • Venue:
  • Performance Evaluation and Benchmarking
  • Year:
  • 2009

Abstract

Extraction-Transform-Load (ETL) processes comprise complex data workflows, which are responsible for the maintenance of a Data Warehouse. A plethora of ETL tools is currently available, constituting a multi-million dollar market. Each ETL tool uses its own technique for the design and implementation of an ETL workflow, making the task of assessing ETL tools extremely difficult. In this paper, we identify common characteristics of ETL workflows in an effort to propose a unified evaluation method for ETL. We also identify the main points of interest in designing, implementing, and maintaining ETL workflows. Finally, we propose a principled organization of test suites based on the TPC-H schema for the problem of experimenting with ETL workflows.