Editorial: Efficient discovery of similarity constraints for matching dependencies

  • Authors:
  • Shaoxu Song; Lei Chen

  • Affiliations:
  • Key Laboratory for Information System Security, MOE, School of Software, Tsinghua University, China and TNList, School of Software, Tsinghua University, China; Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Hong Kong

  • Venue:
  • Data & Knowledge Engineering

  • Year:
  • 2013

Abstract

The concept of matching dependencies (MDs) has recently been proposed for specifying matching rules for object identification. Like functional dependencies (with conditions), MDs can also be applied to various data quality applications, such as detecting violations of integrity constraints. In this paper, we study the problem of discovering similarity constraints for matching dependencies from a given database instance. First, we introduce two measures, support and confidence, for evaluating the utility of MDs in the given data. Then, we study the discovery of MDs that meet given utility requirements on support and confidence. Exact algorithms are developed, together with pruning strategies to improve time performance. Since the exact algorithms have to traverse all the data during the computation, we propose an approximate solution that uses only part of the data. A bound on the relative error introduced by the approximation is also developed. Finally, our experimental evaluation demonstrates the efficiency of the proposed methods.
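
As a rough illustration of the two utility measures named in the abstract, the sketch below scores one candidate MD over all tuple pairs of a small relation. The abstract does not give the formal definitions, so the ones used here are assumptions borrowed from association-rule mining: support is the fraction of tuple pairs that satisfy both sides of the rule, and confidence is the fraction of pairs satisfying the left-hand side that also satisfy the right-hand side. The names `similarity` and `support_confidence`, the use of `difflib.SequenceMatcher` as the similarity function, and the threshold dictionaries are all hypothetical choices made for this example, not the authors' method.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a: str, b: str) -> float:
    """Assumed similarity function: a string ratio in [0, 1]."""
    return SequenceMatcher(None, a, b).ratio()


def support_confidence(rows, lhs, rhs):
    """Score a candidate MD over all unordered tuple pairs.

    rows: list of dicts mapping attribute name -> string value.
    lhs, rhs: dicts mapping attribute name -> minimum similarity
    threshold (the similarity constraints of the rule).
    Returns (support, confidence) under the assumed definitions.
    """
    total = lhs_hits = both_hits = 0
    for r1, r2 in combinations(rows, 2):
        total += 1
        # Does the pair satisfy every similarity constraint on the left?
        if all(similarity(r1[a], r2[a]) >= t for a, t in lhs.items()):
            lhs_hits += 1
            # If so, does it also satisfy the right-hand side?
            if all(similarity(r1[a], r2[a]) >= t for a, t in rhs.items()):
                both_hits += 1
    support = both_hits / total if total else 0.0
    confidence = both_hits / lhs_hits if lhs_hits else 0.0
    return support, confidence
```

This exhaustive loop visits every tuple pair, which mirrors the cost the abstract attributes to the exact approach. Running the same function on a random subset of `rows` would give a sampled estimate, but that is only a loose analogue of the paper's approximate solution and carries none of its error bound.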