Empirical studies to assess the understandability of data warehouse schemas using structural metrics

Authors:
Manuel Angel Serrano;Coral Calero;Houari A. Sahraoui;Mario Piattini
Affiliations:
Alarcos Research Group, Department of Information Technologies and Systems, Universidad de Castilla-La Mancha, Ciudad Real, Spain 13071;Alarcos Research Group, Department of Information Technologies and Systems, Universidad de Castilla-La Mancha, Ciudad Real, Spain 13071;Alarcos Research Group, Department of Information Technologies and Systems, Universidad de Castilla-La Mancha, Ciudad Real, Spain 13071 and Dep. d'Informatique et de Recherche Opératio ...;Alarcos Research Group, Department of Information Technologies and Systems, Universidad de Castilla-La Mancha, Ciudad Real, Spain 13071
Venue:
Software Quality Control
Year:
2008

Citing 0
Cited 5

Towards readable layouts for modeling data warehouses
DOLAP '10 Proceedings of the ACM 13th international workshop on Data warehousing and OLAP

Assessing the maintainability of software product line feature models using structural metrics
Software Quality Control

Decision support for the software product line domain engineering lifecycle
Automated Software Engineering

Complexity metric for multidimensional models for data warehouse
Proceedings of the CUBE International Information Technology Conference

Effective data warehouse for information delivery: a literature survey and classification
International Journal of Networking and Virtual Organisations

Abstract

Data warehouses are powerful tools for making better and faster decisions in organizations where information is an asset of primary importance. Because data warehouses are complex, metrics and procedures are required to continuously assure their quality. This article describes an empirical study and its replication, which investigate the use of structural metrics as indicators of the understandability and, by extension, the cognitive complexity of data warehouse schemas. More specifically, a four-step analysis is conducted: (1) check whether the considered metrics, individually and collectively, correlate with schema understandability, using classical statistical techniques; (2) evaluate whether understandability can be predicted by case similarity, using case-based reasoning; (3) determine, for each level of understandability, the subsets of metrics that are important, by means of a classification technique; and (4) assess, by means of a probabilistic technique, the degree of participation of each metric in the understandability prediction. The results show that although a linear model is a good approximation of the relation between structure and understandability, the associated coefficients are not significant enough. Additionally, the classification analyses reveal, respectively, that prediction can be achieved by considering structure similarity, that extracted classification rules can be used to estimate the magnitude of understandability, and that some metrics, such as the number of fact tables, have more impact than others.
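The first two analysis steps named in the abstract (correlating metrics with understandability, and predicting understandability from similar cases) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the article: the metric vectors, the use of understanding time in seconds as the understandability measure, the toy values in `CASES`, and the choice of Pearson correlation and Euclidean nearest neighbours as stand-ins for the "classical statistical techniques" and "case-based reasoning" the abstract mentions.

```python
from math import sqrt

# HYPOTHETICAL toy data, invented for illustration only (not from the study).
# Each case: (number of fact tables, number of dimension tables,
#             number of foreign keys) -> understanding time in seconds.
CASES = [
    ((1, 3, 3), 60.0),
    ((1, 5, 5), 75.0),
    ((2, 4, 6), 110.0),
    ((2, 6, 9), 130.0),
    ((3, 5, 10), 170.0),
    ((3, 8, 14), 200.0),
]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def euclidean(a, b):
    """Euclidean distance between two metric vectors."""
    return sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def predict_by_similarity(query, cases, k=2):
    """Average the outcome of the k stored cases closest to the query schema.

    This is a plain k-nearest-neighbour stand-in for case-based reasoning;
    the article's actual similarity function is not given in the abstract.
    """
    nearest = sorted(cases, key=lambda case: euclidean(case[0], query))[:k]
    return sum(outcome for _, outcome in nearest) / k


if __name__ == "__main__":
    fact_tables = [metrics[0] for metrics, _ in CASES]
    times = [outcome for _, outcome in CASES]
    print("correlation (fact tables vs. time):", pearson(fact_tables, times))
    print("predicted time for (2, 6, 8):", predict_by_similarity((2, 6, 8), CASES))
```

On real data, the correlation step would be repeated for each metric and complemented by a significance test, since the abstract reports that the linear model's coefficients were not significant enough.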