Attribute and object selection queries on objects with probabilistic attributes

  • Authors:
  • Rabia Nuray-Turan;Dmitri V. Kalashnikov;Sharad Mehrotra;Yaming Yu

  • Affiliations:
  • University of California, Irvine;University of California, Irvine;University of California, Irvine;University of California, Irvine

  • Venue:
  • ACM Transactions on Database Systems (TODS)
  • Year:
  • 2012

Abstract

Modern data processing techniques such as entity resolution, data cleaning, information extraction, and automated tagging often produce results consisting of objects whose attributes may contain uncertainty. This uncertainty is frequently captured as a set of mutually exclusive value choices for each uncertain attribute, together with a probability for each alternative value. However, lay end-users, as well as some end-applications, might not be able to interpret results presented in such a form. Thus, the question is how to present such results to the user in practice, for example, to support the attribute-value selection and object selection queries the user might be interested in. Specifically, in this article we study the problem of maximizing the quality of answers to these selection queries on top of such a probabilistic representation. Quality is measured using standard set-based quality metrics. We formalize the problem and then develop efficient approaches that provide high-quality answers for these queries. A comprehensive empirical evaluation over three different domains demonstrates the advantage of our approach over existing techniques.
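To make the setting concrete, the following is a minimal illustrative sketch, not the approach developed in the article. It assumes a hypothetical representation in which each object carries one uncertain attribute stored as a dictionary from mutually exclusive candidate values to probabilities, and it answers an object selection query with a simple probability threshold. The names `objects`, `select_objects`, `expected_precision`, and `approx_expected_recall` are invented for illustration, and precision and recall are used here only as familiar examples of set-based metrics.

```python
from typing import Dict, List

# Hypothetical data: each object has one uncertain attribute, stored as
# mutually exclusive candidate values with probabilities that sum to 1.
ProbAttr = Dict[str, float]

objects: Dict[str, ProbAttr] = {
    "img1": {"beach": 0.7, "desert": 0.3},
    "img2": {"beach": 0.4, "forest": 0.6},
    "img3": {"beach": 0.1, "city": 0.9},
}


def select_objects(objs: Dict[str, ProbAttr], value: str, threshold: float) -> List[str]:
    """Return ids of objects whose probability of having `value` is >= threshold.

    A plain threshold rule, used here only as a baseline illustration.
    """
    return [oid for oid, dist in objs.items() if dist.get(value, 0.0) >= threshold]


def expected_precision(objs: Dict[str, ProbAttr], value: str, answer: List[str]) -> float:
    """Expected fraction of returned objects that truly have `value`.

    For a fixed answer set this is exact by linearity of expectation.
    """
    if not answer:
        return 0.0
    return sum(objs[oid].get(value, 0.0) for oid in answer) / len(answer)


def approx_expected_recall(objs: Dict[str, ProbAttr], value: str, answer: List[str]) -> float:
    """Ratio-of-expectations approximation of recall (an assumption, not exact):
    expected number of correct returned objects over expected number of
    objects that truly have `value`.
    """
    total = sum(dist.get(value, 0.0) for dist in objs.values())
    if total == 0.0:
        return 0.0
    return sum(objs[oid].get(value, 0.0) for oid in answer) / total


if __name__ == "__main__":
    for t in (0.5, 0.3):
        ans = select_objects(objects, "beach", t)
        print(t, ans,
              round(expected_precision(objects, "beach", ans), 3),
              round(approx_expected_recall(objects, "beach", ans), 3))
```

Lowering the threshold returns more objects, which in this toy data raises the approximate recall while lowering expected precision. Choosing the answer set so that a set-based metric is maximized is the kind of trade-off the abstract refers to.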