Extending inclusion dependencies with conditions

  • Authors:
  • Shuai Ma;Wenfei Fan;Loreto Bravo

  • Affiliations:
  • SKLSDE Lab, Beihang University, China;SKLSDE Lab, Beihang University, China and University of Edinburgh, UK;University of Concepcion, Chile

  • Venue:
  • Theoretical Computer Science
  • Year:
  • 2014

Quantified Score

Hi-index 5.23

Visualization

Abstract

This paper introduces a class of conditional inclusion dependencies (CINDs), which extends inclusion dependencies (INDs) by enforcing patterns of semantically related data values. We show that CINDs are useful not only in data cleaning, but also in contextual schema matching. We give a full treatment of the static analysis of CINDs, and show that CINDs retain most desired properties of traditional INDs: (a) CINDs are always satisfiable; (b) CINDs are finitely axiomatizable, i.e., there exists a sound and complete inference system for the implication analysis of CINDs; and (c) the implication problem for CINDs has the same complexity as its traditional counterpart, namely, PSPACE-complete, in the absence of attributes with a finite domain; but it is EXPTIME-complete in the general setting. In addition, we investigate the interaction between CINDs and conditional functional dependencies (CFDs), as well as two practical fragments of CINDs, namely acyclic CINDs and unary CINDs. We show the following: (d) the satisfiability problem for the combination of CINDs and CFDs becomes undecidable, even in the absence of finite-domain attributes; (e) in the absence of finite-domain attributes, the implication problem for acyclic CINDs and for unary CINDs retains the same complexity as its traditional counterpart, namely, NP-complete and PTIME, respectively; but in the general setting, it becomes PSPACE-complete and coNP-complete, respectively; and (f) the implication problem for acyclic unary CINDs remains in PTIME in the absence of finite-domain attributes and coNP-complete in the general setting.