A grain preservation translation algorithm: From ER diagram to multidimensional model

  • Authors:
  • Yen-Ting Chen; Ping-Yu Hsu

  • Affiliations:
  • Department of Business Administration, National Central University, No. 300, Jhongda Road, Jhongli City, Taoyuan County 32001, Taiwan, ROC and Department of Information Management, Lunghwa Univers ...;Department of Business Administration, National Central University, No. 300, Jhongda Road, Jhongli City, Taoyuan County 32001, Taiwan, ROC

  • Venue:
  • Information Sciences: an International Journal
  • Year:
  • 2007

Abstract

Many IT practitioners and researchers advocate that the data models of data warehouses should incorporate the sources of their data in order to achieve maximum efficiency. Because the source data are likely to come from systems designed with ER diagrams, a great deal of research has been devoted to methodologies for building multidimensional models from source ER diagrams. However, to the best of our knowledge, no algorithm has been proposed that can systematically translate an entire ER diagram into a multidimensional model with hierarchical snowflake structures. In this paper, we propose an algorithm that achieves this goal by incorporating two features: grain preservation and minimal distance from each dimension table to the fact table. The grain preservation feature guarantees that the translated multidimensional model maintains cohesive granularity among the entities. The minimal distance feature guarantees that if an entity can be connected to the fact table in the multidimensional model by more than one path, the path with the fewest hops is always chosen. The first feature is achieved by translating ambiguous relationships between entities into weighting factors stored in bridge tables and by enhancing fact tables with unique primary keys. The second feature results from including a revised shortest path algorithm in the translation algorithm, with distance measured as the number of relationships between entities. A prototype system based on the methodology has also been developed, and screenshots of its execution are presented.
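
The minimal distance feature can be illustrated with a small sketch. The abstract does not specify how the authors revise the shortest path algorithm, so the code below uses a plain breadth-first search as a stand-in: it treats each relationship as one hop and records, for every entity, the fewest hops back to the fact table. The entity names, the `min_hops` and `path_to_fact` helpers, and the sample diagram are all hypothetical and are not taken from the paper.

```python
from collections import deque


def min_hops(relationships, fact):
    """Breadth-first search from the fact entity.

    Returns two dicts: hop count per entity, and the neighbor through
    which each entity is reached on a fewest-hop path.
    """
    # Treat every relationship as an undirected edge of length one.
    adjacency = {}
    for a, b in relationships:
        adjacency.setdefault(a, []).append(b)
        adjacency.setdefault(b, []).append(a)

    hops = {fact: 0}
    parent = {fact: None}
    queue = deque([fact])
    while queue:
        current = queue.popleft()
        for neighbor in adjacency.get(current, []):
            if neighbor not in hops:  # first visit is the fewest-hop visit
                hops[neighbor] = hops[current] + 1
                parent[neighbor] = current
                queue.append(neighbor)
    return hops, parent


def path_to_fact(parent, entity):
    """Walk parent links from an entity back to the fact entity."""
    path = []
    while entity is not None:
        path.append(entity)
        entity = parent[entity]
    return path


# Hypothetical ER diagram: Region is reachable through Customer (3 hops)
# and directly through Order (2 hops).
relationships = [
    ("OrderLine", "Order"),
    ("OrderLine", "Product"),
    ("Order", "Customer"),
    ("Customer", "Region"),
    ("Order", "Region"),
    ("Product", "Category"),
]

hops, parent = min_hops(relationships, "OrderLine")
region_path = path_to_fact(parent, "Region")
```

In this sample, `Region` is attached through `Order` rather than through `Customer`, because that route needs fewer relationships to reach the fact entity `OrderLine`. Breadth-first search suffices here only because every relationship counts as exactly one hop; a weighted distance would need a different algorithm.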