Why is the snowflake schema a good data warehouse design?

Authors:
Mark Levene;George Loizou
Affiliations:
School of Computer Science and Information Systems, Birkbeck College, University of London, Malet Street, London WC1E 7HX, UK (both authors)
Venue:
Information Systems
Year:
2003



Abstract

Database design for data warehouses is based on the snowflake schema and its important special case, the star schema. The snowflake schema represents a dimensional model composed of a central fact table and a set of constituent dimension tables, which can be further broken up into subdimension tables. We formalise the concept of a snowflake schema in terms of an acyclic database schema whose join tree satisfies certain structural properties. We then define a normal form for snowflake schemas that captures their intuitive meaning with respect to a set of functional and inclusion dependencies. We show that snowflake schemas in this normal form are independent as well as separable when the relation schemas are pairwise incomparable. This implies that relations in the data warehouse can be updated independently of each other as long as referential integrity is maintained. In addition, we show that a data warehouse in snowflake normal form can be queried by joining the relation over the fact table with the relations over its dimension and subdimension tables. We also examine an information-theoretic interpretation of the snowflake schema and show that the redundancy of the primary key of the fact table is zero.
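
As an informal illustration of the structure described in the abstract, the following sketch builds a tiny snowflake schema in SQLite and queries it by joining the fact table with its dimension and subdimension tables. It is not taken from the paper: the table and column names (`sales`, `product`, `category`, `store`) and the sample rows are hypothetical, and the foreign keys are only a stand-in for the inclusion dependencies the abstract refers to.

```python
import sqlite3


def build_snowflake(conn):
    """Create a hypothetical snowflake schema: one fact table, two dimensions,
    and one subdimension hanging off the product dimension."""
    # Ask SQLite to enforce foreign keys (referential integrity).
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE category (              -- subdimension table
            category_id   INTEGER PRIMARY KEY,
            category_name TEXT NOT NULL
        );
        CREATE TABLE product (               -- dimension table
            product_id   INTEGER PRIMARY KEY,
            product_name TEXT NOT NULL,
            category_id  INTEGER NOT NULL REFERENCES category(category_id)
        );
        CREATE TABLE store (                 -- dimension table
            store_id INTEGER PRIMARY KEY,
            city     TEXT NOT NULL
        );
        CREATE TABLE sales (                 -- central fact table
            product_id INTEGER NOT NULL REFERENCES product(product_id),
            store_id   INTEGER NOT NULL REFERENCES store(store_id),
            quantity   INTEGER NOT NULL,
            PRIMARY KEY (product_id, store_id)
        );
        """
    )


def load_sample(conn):
    """Insert a few made-up rows, parents before children."""
    conn.executemany("INSERT INTO category VALUES (?, ?)",
                     [(1, "Beverages"), (2, "Snacks")])
    conn.executemany("INSERT INTO product VALUES (?, ?, ?)",
                     [(10, "Tea", 1), (11, "Coffee", 1), (12, "Crisps", 2)])
    conn.executemany("INSERT INTO store VALUES (?, ?)",
                     [(100, "London"), (101, "Leeds")])
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)",
                     [(10, 100, 5), (11, 100, 3), (12, 100, 2), (10, 101, 4)])
    conn.commit()


def totals_by_category_and_city(conn):
    """Join the fact table with its dimension and subdimension tables."""
    query = """
        SELECT c.category_name, st.city, SUM(s.quantity)
        FROM sales AS s
        JOIN product  AS p  ON p.product_id  = s.product_id
        JOIN category AS c  ON c.category_id = p.category_id
        JOIN store    AS st ON st.store_id   = s.store_id
        GROUP BY c.category_name, st.city
        ORDER BY c.category_name, st.city
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    build_snowflake(connection)
    load_sample(connection)
    for row in totals_by_category_and_city(connection):
        print(row)
```

In this sketch, each table can be loaded or updated on its own, provided every foreign key value already exists in the referenced table. A `sales` row that names an unknown product is rejected with `sqlite3.IntegrityError`, which loosely mirrors the abstract's condition that referential integrity be maintained.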