Determining the relative accuracy of attributes

  • Authors:
  • Yang Cao; Wenfei Fan; Wenyuan Yu

  • Affiliations:
  • School of Informatics, University of Edinburgh/ Big Data Research Center and SKLSDE Lab, Beihang University, Beijing, China;School of Informatics, University of Edinburgh/ Big Data Research Center and SKLSDE Lab, Beihang University, Edinburgh, United Kingdom;School of Informatics, University of Edinburgh, Edinburgh, United Kingdom

  • Venue:
  • Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data
  • Year:
  • 2013

Abstract

The relative accuracy problem is to determine, given tuples t1 and t2 that refer to the same entity e, whether t1[A] is more accurate than t2[A], i.e., whether t1[A] is closer to the true value of the A attribute of e than t2[A]. This has been a longstanding issue for data quality, and is challenging when the true values of e are unknown. This paper proposes a model for determining relative accuracy. (1) We introduce a class of accuracy rules and an inference system with a chase procedure, to deduce relative accuracy. (2) We identify and study several fundamental problems for relative accuracy. Given a set Ie of tuples pertaining to the same entity e and a set of accuracy rules, these problems are to decide whether the chase process terminates, is Church-Rosser, and leads to a unique target tuple te composed of the most accurate values from Ie for all the attributes of e. (3) We propose a framework for inferring accurate values with user interaction. (4) We provide algorithms underlying the framework, to find the unique target tuple te whenever possible; when there is not enough information to decide a complete te, we compute top-k candidate targets based on a preference model. (5) Using real-life and synthetic data, we experimentally verify the effectiveness and efficiency of our method.
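
To make the idea of deducing a target tuple from pairwise accuracy judgments concrete, the following is a minimal Python sketch. It is not the paper's rule language, chase procedure, or algorithms, which the abstract does not spell out. It assumes a rule is simply a function that, given two tuples, names the attributes on which the second is at least as accurate as the first; it closes the resulting per-attribute orders under transitivity and keeps a value only when it dominates all others. The names `infer_target` and `newer_record_rule`, the attribute names, and the sample records are all hypothetical.

```python
from itertools import permutations


def newer_record_rule(t1, t2):
    # Hypothetical rule: a record updated later is assumed to carry a more
    # accurate "address" and "updated" value than an older one.
    return ["address", "updated"] if t1["updated"] < t2["updated"] else []


def infer_target(tuples, rules, attributes):
    """Return {attribute: most accurate value, or None if undecided}."""
    n = len(tuples)
    # order[a] holds pairs (i, j): tuples[j][a] is at least as accurate
    # as tuples[i][a].
    order = {a: set() for a in attributes}

    # Apply every rule to every ordered pair of tuples.
    for i, j in permutations(range(n), 2):
        for rule in rules:
            for a in rule(tuples[i], tuples[j]):
                order[a].add((i, j))

    # Close each order under transitivity with a simple fixpoint loop.
    for a in attributes:
        changed = True
        while changed:
            changed = False
            for (i, j) in list(order[a]):
                for (k, l) in list(order[a]):
                    if j == k and i != l and (i, l) not in order[a]:
                        order[a].add((i, l))
                        changed = True

    # A value is chosen only if it dominates (or equals) every other value.
    target = {}
    for a in attributes:
        best = [
            j for j in range(n)
            if all(
                tuples[i][a] == tuples[j][a] or (i, j) in order[a]
                for i in range(n) if i != j
            )
        ]
        values = {tuples[j][a] for j in best}
        target[a] = values.pop() if len(values) == 1 else None
    return target


# Three hypothetical records that refer to the same entity.
records = [
    {"name": "J. Smith", "address": "1 Old Rd", "updated": 2010},
    {"name": "John Smith", "address": "2 New St", "updated": 2012},
    {"name": "John Smith", "address": "2 New St", "updated": 2011},
]

print(infer_target(records, [newer_record_rule], ["name", "address", "updated"]))
```

With the single rule above, the sketch settles `address` and `updated` but leaves `name` as `None`, because no rule compares the two name values. This mirrors, in a very simplified way, the situation the abstract describes in which there is not enough information to decide a complete target tuple; the paper's top-k candidate ranking and user interaction are not modeled here.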