Referential integrity quality metrics

  • Authors:
  • Carlos Ordonez; Javier García-García

  • Affiliations:
  • University of Houston, Department of Computer Science, Houston, TX 77204, USA; Universidad Nacional Autónoma de México, Facultad de Ciencias, UNAM, Mexico City, CU 04510, Mexico

  • Venue:
  • Decision Support Systems
  • Year:
  • 2008

Abstract

Referential integrity is an essential global constraint in a relational database that keeps it in a complete and consistent state. In this work, we assume the database may violate referential integrity and relations may be denormalized. We propose a set of quality metrics, defined at four granularity levels (database, relation, attribute and value), that measure referential completeness and consistency. The quality metrics are computed efficiently with standard SQL queries that incorporate two query optimizations: left outer joins on foreign keys and early foreign key grouping. Experiments evaluate the proposed metrics and SQL query optimizations on real and synthetic databases, showing that they can help detect and explain referential errors.
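
As an illustration of the two query optimizations named in the abstract, the following minimal sketch counts null and dangling foreign key values in a toy database, once with a plain left outer join and once after grouping the foreign key first. The tables `city` and `store`, their columns, the sample rows, and the three counts are all assumptions made for this example; they are not the paper's schema or its exact metric definitions.

```python
import sqlite3

# Hypothetical toy schema: store.city_id is meant to reference city.city_id,
# but no constraint is enforced, so violations can exist.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE city  (city_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE store (store_id INTEGER PRIMARY KEY, city_id INTEGER);
    INSERT INTO city  VALUES (1, 'Houston'), (2, 'Mexico City');
    INSERT INTO store VALUES (10, 1), (11, 1), (12, 2),
                             (13, 7), (14, 7), (15, NULL);
""")

# Left outer join on the foreign key: every referencing row is kept, and rows
# without a match in the referenced table show NULL on the city side.
PLAIN_SQL = """
    SELECT COUNT(*),
           SUM(CASE WHEN s.city_id IS NULL THEN 1 ELSE 0 END),
           SUM(CASE WHEN s.city_id IS NOT NULL AND c.city_id IS NULL
                    THEN 1 ELSE 0 END)
    FROM store s
    LEFT OUTER JOIN city c ON s.city_id = c.city_id
"""

# Early foreign key grouping: aggregate the referencing table by foreign key
# value first, then join the (usually much smaller) grouped result.
GROUPED_SQL = """
    SELECT SUM(g.n),
           SUM(CASE WHEN g.city_id IS NULL THEN g.n ELSE 0 END),
           SUM(CASE WHEN g.city_id IS NOT NULL AND c.city_id IS NULL
                    THEN g.n ELSE 0 END)
    FROM (SELECT city_id, COUNT(*) AS n FROM store GROUP BY city_id) g
    LEFT OUTER JOIN city c ON g.city_id = c.city_id
"""

# Value-level view: which foreign key values are dangling, and how often.
VALUE_SQL = """
    SELECT g.city_id, g.n
    FROM (SELECT city_id, COUNT(*) AS n FROM store GROUP BY city_id) g
    LEFT OUTER JOIN city c ON g.city_id = c.city_id
    WHERE g.city_id IS NOT NULL AND c.city_id IS NULL
    ORDER BY g.city_id
"""

plain_counts = conn.execute(PLAIN_SQL).fetchone()      # (total, null, dangling)
grouped_counts = conn.execute(GROUPED_SQL).fetchone()  # same three counts
invalid_values = conn.execute(VALUE_SQL).fetchall()

total, null_fk, dangling_fk = plain_counts
print(f"rows={total} null_fk={null_fk} dangling_fk={dangling_fk}")
print("dangling values:", invalid_values)
```

Both queries return the same counts. The grouped form joins one row per distinct foreign key value instead of one row per referencing tuple, which is where a saving would be expected when the referencing table is large and its foreign key has few distinct values.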