Repairing OLAP queries in databases with referential integrity errors

  • Authors:
  • Javier García-García;Carlos Ordonez

  • Affiliations:
  • Instituto Politécnico Nacional & Universidad Nacional Autónoma de México, Mexico City, Mexico;University of Houston, Houston, TX, USA

  • Venue:
  • DOLAP '10 Proceedings of the ACM 13th international workshop on Data warehousing and OLAP
  • Year:
  • 2010

Abstract

Many database applications and OLAP tools dynamically generate SQL queries that involve join operators and aggregate functions, and send these queries to a database server for execution. Such generated SQL code normally assumes that the underlying tables and columns are clean, so it lacks the robustness needed to handle foreign keys with null, invalid, or undefined values, which are ubiquitous in databases with inconsistent or incomplete content. As a result, several issues arise at query time, mostly as inconsistencies in answer sets that users of OLAP tools find difficult to detect and explain. In this article, we present an automated query rewriting method for automatically generated OLAP queries that are executed over tables whose foreign key columns may contain null or invalid values. Our method applies to queries that use join operators and aggregate functions obeying the summarizability property (e.g., sum(), count()). At the request of a user of an OLAP tool, our method rewrites queries that use join operators so that the user is warned of the referential integrity condition of the underlying database and, when aggregate functions are involved, the answer sets present alternative consistent results. A preliminary experimental evaluation shows that rewritten queries provide valuable information on referential integrity and take almost the same time as the original queries, indicating that the overhead is minimal.
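
The kind of inconsistency described in the abstract can be illustrated with a small, self-contained sketch. It is not the authors' rewriting method, which the abstract does not specify; it only shows one plausible rewrite (an outer join plus explicit groups for null and invalid foreign keys) on an in-memory SQLite database. The tables `sales` and `store`, their columns, and the group labels are hypothetical names chosen for illustration.

```python
import sqlite3

# Hypothetical schema: sales.store_id is meant to reference store.store_id,
# but the constraint is not enforced, so dirty values can appear.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE store (store_id INTEGER PRIMARY KEY, city TEXT);
CREATE TABLE sales (sale_id INTEGER PRIMARY KEY, store_id INTEGER, amount REAL);
INSERT INTO store VALUES (1, 'Houston'), (2, 'Mexico City');
INSERT INTO sales VALUES
  (1, 1,    100.0),   -- valid foreign key
  (2, 2,     50.0),   -- valid foreign key
  (3, NULL,  30.0),   -- null foreign key
  (4, 99,    20.0);   -- invalid foreign key (no store 99)
""")

# A typical generated query: the inner join silently drops rows 3 and 4.
original = """
SELECT s.city, SUM(f.amount)
FROM sales f JOIN store s ON f.store_id = s.store_id
GROUP BY s.city
ORDER BY s.city
"""

# Illustrative rewrite (an assumption, not the paper's algorithm): keep every
# fact row with a left outer join and report dirty foreign keys as their own groups.
rewritten = """
SELECT CASE
         WHEN f.store_id IS NULL THEN '(null foreign key)'
         WHEN s.store_id IS NULL THEN '(invalid foreign key)'
         ELSE s.city
       END AS city_group,
       SUM(f.amount)
FROM sales f LEFT OUTER JOIN store s ON f.store_id = s.store_id
GROUP BY city_group
ORDER BY city_group
"""

original_rows = conn.execute(original).fetchall()
rewritten_rows = conn.execute(rewritten).fetchall()
grand_total = conn.execute("SELECT SUM(amount) FROM sales").fetchone()[0]

print("original :", original_rows)
print("rewritten:", rewritten_rows)
print("sum over original groups :", sum(total for _, total in original_rows))
print("sum over rewritten groups:", sum(total for _, total in rewritten_rows))
print("grand total in sales     :", grand_total)
```

In this sketch the per-city totals of the original query no longer add up to the grand total of `sales`, which is the summarizability problem the abstract refers to; the rewritten query restores the total and makes the affected rows visible to the user.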