A user-driven data warehouse evolution approach for concurrent personalized analysis needs

  • Authors:
  • Fadila Bentayeb; Cécile Favre; Omar Boussaid

  • Affiliations:
  • (Correspd. Tel.: +33 478 772 682/ Fax: +33 478 772 375/ E-mail: bentayeb@eric.univ-lyon2.fr) ERIC Laboratory, University of Lyon 2, Bât. L, 5 av. Pierre Mendès-France, 69676 Bron Cedex, Fra ...; ERIC Laboratory, University of Lyon 2, Bât. L, 5 av. Pierre Mendès-France, 69676 Bron Cedex, France; ERIC Laboratory, University of Lyon 2, Bât. L, 5 av. Pierre Mendès-France, 69676 Bron Cedex, France

  • Venue:
  • Integrated Computer-Aided Engineering
  • Year:
  • 2008

Abstract

Data warehouses store aggregated data drawn from different sources to meet users' analysis needs for decision support. The nature of users' work means that their requirements change often and never reach a final state. A data warehouse therefore cannot be designed in one step; it usually evolves over time. In this paper, we propose a user-driven approach that enables data warehouse schema updates. It consists of integrating the users' knowledge into the data warehouse model to allow new analysis possibilities. More precisely, we consider the specific user knowledge that defines new aggregated data, expressed as "if-then" rules that we call aggregation rules. These rules are used to dynamically create new granularity levels in dimension hierarchies, in an automatic and concurrent way. Our approach is composed of four phases: (1) users' knowledge acquisition, (2) knowledge integration, (3) data warehouse schema update, and (4) on-line analysis. To support our approach, we define a Rule-based Data Warehouse (R-DW) model composed of two parts: one "fixed" part and one "evolving" part. The fixed part corresponds to the initial data warehouse schema, whose purpose is to answer global analysis needs. The evolving part is defined by means of aggregation rules, which allow personalized analyses. To validate our approach, we developed a prototype called WEDriK (data Warehouse Evolution Driven by Knowledge), in which the R-DW model is implemented within the Oracle 10g DBMS. We also show how to put our approach into practice by proposing a model dedicated to managing data warehouse schema evolution, together with the update algorithms. Furthermore, we applied our approach to banking data from the French bank LCL-Le Crédit Lyonnais, and we illustrate it with the LCL case study.
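
As a rough illustration of how "if-then" aggregation rules could define a new granularity level above an existing dimension level, here is a minimal Python sketch. It is not the WEDriK implementation or the R-DW model itself: the rule format, the names `AGENCY_TYPE_RULES`, `apply_rules` and `roll_up`, the `agency_id` and `income` fields, and all data values are assumptions made up for this example.

```python
from collections import defaultdict

# Hypothetical aggregation rules: each pair is (condition on a dimension
# member, value of the new, coarser level). All names and data are invented.
AGENCY_TYPE_RULES = [
    (lambda agency: agency["agency_id"] in {"A1", "A2"}, "student"),
    (lambda agency: agency["agency_id"] == "A3", "foreigner"),
]


def apply_rules(members, rules, default="other"):
    """Assign each dimension member to the value of the first matching rule."""
    mapping = {}
    for member in members:
        mapping[member["agency_id"]] = next(
            (value for condition, value in rules if condition(member)),
            default,  # members matched by no rule fall into a default group
        )
    return mapping


def roll_up(facts, mapping, measure="income"):
    """Sum a measure over the new level defined by the member mapping."""
    totals = defaultdict(float)
    for fact in facts:
        totals[mapping[fact["agency_id"]]] += fact[measure]
    return dict(totals)


agencies = [{"agency_id": key} for key in ("A1", "A2", "A3", "A4")]
facts = [
    {"agency_id": "A1", "income": 100.0},
    {"agency_id": "A2", "income": 50.0},
    {"agency_id": "A3", "income": 30.0},
    {"agency_id": "A4", "income": 20.0},
]

agency_type = apply_rules(agencies, AGENCY_TYPE_RULES)  # the "evolving" level
totals = roll_up(facts, agency_type)                     # analysis on that level
```

In this sketch the fact rows and the agency level stand in for the "fixed" part of the schema, while the rule-derived mapping plays the role of the "evolving" part: a different user could supply a different rule set and obtain a different level without touching the underlying facts.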