Metadata management in a multiversion data warehouse

  • Authors: Robert Wrembel; Bartosz Bębel
  • Affiliations: Institute of Computing Science, Poznan University of Technology, Poznan, Poland (both authors)
  • Venue: Journal on Data Semantics VIII
  • Year: 2007

Abstract

A data warehouse (DW) is a database that integrates data from external data sources (EDSs) for the purpose of advanced analysis. EDSs are production systems that often change not only their contents but also their structures. The evolution of EDSs has to be reflected in the DW that integrates them. Traditional DW systems offer only limited support for the evolution of their structures. Our solution to this problem is based on a multiversion data warehouse (MVDW). Such a DW is composed of a sequence of persistent versions, each of which describes a schema and data within a given time period. Managing the MVDW requires a metadata model that is much more complex than those of traditional data warehouses. In our approach and prototype MVDW system, the metadata model contains data structures that support: (1) monitoring EDSs with respect to content and structural changes, (2) automatically generating the processes that monitor EDSs, (3) applying discovered EDS changes to a selected DW version, (4) describing the structure of every DW version, and (5) querying multiple DW versions at the same time and presenting the results coming from those versions.
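The idea of a DW built from a sequence of versions, each valid for a time period and queried side by side, can be illustrated with a minimal Python sketch. It is not the authors' metadata model or prototype: the `DWVersion` class, its fields, and the functions `versions_in_range` and `query_versions` are hypothetical names introduced here, and the in-memory dictionaries merely stand in for real schema metadata and fact data.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List, Optional


@dataclass
class DWVersion:
    """Hypothetical record describing one persistent DW version."""
    name: str
    valid_from: date
    valid_to: Optional[date]  # None means the version is still current
    # Assumed toy metadata: table name -> list of column names
    schema: Dict[str, List[str]] = field(default_factory=dict)
    # Assumed toy data: table name -> list of rows (dicts)
    rows: Dict[str, List[dict]] = field(default_factory=dict)


def versions_in_range(versions, start, end):
    """Return the versions whose validity period overlaps [start, end]."""
    return [
        v for v in versions
        if v.valid_from <= end and (v.valid_to is None or v.valid_to >= start)
    ]


def query_versions(versions, table, column, start, end):
    """Sum `column` of `table` separately in every overlapping version.

    A version whose schema lacks the table or column is reported as None,
    so the caller can see where the structure differs between versions.
    """
    results = {}
    for v in versions_in_range(versions, start, end):
        if column not in v.schema.get(table, []):
            results[v.name] = None
            continue
        results[v.name] = sum(r[column] for r in v.rows.get(table, []))
    return results


# Two versions: the second one adds a "region" column to the sales table.
v1 = DWVersion(
    "V1", date(2005, 1, 1), date(2005, 12, 31),
    {"sales": ["product", "amount"]},
    {"sales": [{"product": "A", "amount": 10}, {"product": "B", "amount": 5}]},
)
v2 = DWVersion(
    "V2", date(2006, 1, 1), None,
    {"sales": ["product", "amount", "region"]},
    {"sales": [{"product": "A", "amount": 7, "region": "N"}]},
)

# One query spanning both validity periods yields one result per version.
result = query_versions([v1, v2], "sales", "amount",
                        date(2005, 6, 1), date(2006, 6, 1))
print(result)
```

Keeping the results separate per version, rather than merging them, is a deliberate simplification in this sketch: it sidesteps the question of how values from structurally different versions should be combined.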