Representing Tuple and Attribute Uncertainty in Probabilistic Databases

  • Authors:
  • Prithviraj Sen;Amol Deshpande;Lise Getoor

  • Venue:
  • ICDMW '07 Proceedings of the Seventh IEEE International Conference on Data Mining Workshops
  • Year:
  • 2007

Abstract

There has been a recent surge in work in probabilistic databases, propelled in large part by the huge increase in noisy data sources -- sensor data, experimental data, data from uncurated sources, and many others. There is a growing need to be able to flexibly represent the uncertainties in the data, and to efficiently query the data. Building on existing probabilistic database work, we present a unifying framework which allows a flexible representation of correlated tuple and attribute level uncertainties. An important capability of our representation is the ability to represent shared correlation structures in the data. We provide motivating examples to illustrate when such shared correlation structures are likely to exist. Representing shared correlation structures allows the use of sophisticated inference techniques based on lifted probabilistic inference, which, in turn, allows us to achieve significant speedups while computing probabilities for results of user-submitted queries.
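
The speedup from shared structure can be illustrated with a deliberately simplified sketch. It is not the authors' framework or algorithm: it assumes n uncertain tuples that are mutually independent and all share one existence probability p (a single shared factor), and it asks for the probability that a query returning any of them has a non-empty result. The function names `prob_any_exists_grounded` and `prob_any_exists_lifted` are hypothetical and chosen only for this illustration. The grounded version enumerates all 2^n possible worlds; the lifted version exploits the fact that the tuples are interchangeable and computes the same value in closed form.

```python
from itertools import product


def prob_any_exists_grounded(probs):
    """Sum the probability of every possible world in which at least one
    tuple exists. Enumerates 2^n worlds, so it is only feasible for small n."""
    total = 0.0
    for world in product([False, True], repeat=len(probs)):
        # Tuples are assumed independent (an assumption of this sketch),
        # so a world's probability is the product of per-tuple terms.
        weight = 1.0
        for present, p in zip(world, probs):
            weight *= p if present else 1.0 - p
        if any(world):
            total += weight
    return total


def prob_any_exists_lifted(p, n):
    """Same quantity when all n tuples share one existence probability p:
    1 minus the probability that every tuple is absent."""
    return 1.0 - (1.0 - p) ** n


if __name__ == "__main__":
    p, n = 0.3, 10
    print(prob_any_exists_grounded([p] * n))
    print(prob_any_exists_lifted(p, n))
```

Both functions return the same probability up to floating-point rounding, but the lifted one avoids the exponential enumeration by treating the identically distributed tuples as a group. The correlated case addressed in the paper is more general than this independent-tuple toy.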