ETL queues for active data warehousing

  • Authors:
  • Alexandros Karakasidis;Panos Vassiliadis;Evaggelia Pitoura

  • Affiliations:
  • Univ. of Ioannina, Ioannina, Hellas;Univ. of Ioannina, Ioannina, Hellas;Univ. of Ioannina, Ioannina, Hellas

  • Venue:
  • Proceedings of the 2nd international workshop on Information quality in information systems
  • Year:
  • 2005

Abstract

Traditionally, data warehouses have been refreshed off-line. Active data warehousing refers to a new trend in which data warehouses are updated as frequently as possible to meet users' high demand for fresh data. In this paper, we propose a framework for implementing active data warehousing with the following goals: (a) minimal changes to the software configuration of the source, (b) minimal overhead for the source due to the active nature of data propagation, and (c) the possibility of smoothly regulating the overall configuration of the environment in a principled way. In our framework, we implement ETL activities over queue networks and employ queueing theory to predict the performance and tune the operation of the overall refreshment process. Because of the performance overheads incurred, we explore different architectural choices for this task and discuss the issues that arise for each of them.
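
To give a feel for how queueing theory can predict the performance of an ETL activity, here is a minimal sketch in Python. The abstract does not say which queueing model the paper uses. The M/M/1 model (Poisson arrivals, exponential service times, one server), the names `MM1Queue` and `pipeline_delay`, and the numeric rates are all assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MM1Queue:
    """One ETL activity modelled as an M/M/1 queue (an assumed model)."""

    arrival_rate: float  # lambda: records arriving per second
    service_rate: float  # mu: records the activity can process per second

    def __post_init__(self) -> None:
        if self.arrival_rate < 0 or self.service_rate <= 0:
            raise ValueError("rates must be non-negative (arrival) and positive (service)")
        if self.arrival_rate >= self.service_rate:
            # The steady-state formulas below only hold when lambda < mu.
            raise ValueError("unstable queue: arrival rate must be below service rate")

    @property
    def utilization(self) -> float:
        """Fraction of time the activity is busy: rho = lambda / mu."""
        return self.arrival_rate / self.service_rate

    @property
    def mean_records_in_system(self) -> float:
        """Average number of records waiting or in service: rho / (1 - rho)."""
        rho = self.utilization
        return rho / (1.0 - rho)

    @property
    def mean_response_time(self) -> float:
        """Average time a record spends in the activity: 1 / (mu - lambda)."""
        return 1.0 / (self.service_rate - self.arrival_rate)


def pipeline_delay(arrival_rate: float, service_rates: list[float]) -> float:
    """Mean end-to-end delay of activities chained in series.

    Assumes every activity forwards each record it receives (no filtering),
    so each stage sees the same arrival rate, and that each stage behaves
    as an independent M/M/1 queue.
    """
    return sum(MM1Queue(arrival_rate, mu).mean_response_time for mu in service_rates)


if __name__ == "__main__":
    activity = MM1Queue(arrival_rate=80.0, service_rate=100.0)
    print(f"utilization:        {activity.utilization:.2f}")
    print(f"records in system:  {activity.mean_records_in_system:.2f}")
    print(f"response time (s):  {activity.mean_response_time:.3f}")
    print(f"two-stage delay (s): {pipeline_delay(80.0, [100.0, 120.0]):.3f}")
```

Under these assumptions, tuning amounts to choosing service rates (for example, through batching or added resources) that keep utilization below 1 and the predicted delay within the freshness target. A real deployment would need to check whether arrivals and service times actually fit the assumed distributions.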