An extensible metadata framework for data quality assessment of composite structures

  • Authors:
  • José Farinha; Maria José Trigueiros

  • Affiliations:
  • ISCTE/ADETTI, Department of Science and Information Technology, Lisbon, Portugal; ISCTE/ADETTI, Department of Science and Information Technology, Lisbon, Portugal

  • Venue:
  • DaWaK'07 Proceedings of the 9th international conference on Data Warehousing and Knowledge Discovery
  • Year:
  • 2007

Abstract

Data quality is a critical issue in both operational databases and data warehouse systems. Data quality assessment is a strong requirement for the ETL subsystem, since bad data may destroy the credibility of a data warehouse. Over the last two decades, research and development efforts in the data quality field have produced techniques for data profiling and cleaning, which focus on detecting and correcting bad values in data. Little effort has been devoted to data quality as it relates to the well-formedness of coarse-grained data structures resulting from the assembly of linked data records. This paper proposes a metadata model that supports the structural validation of linked data records from a data quality point of view. The metamodel is built on top of the CWM standard and supports the specification of data structure quality rules at a high level of abstraction, as well as by means of very specific, fine-grained business rules.