Representing Tuple and Attribute Uncertainty in Probabilistic Databases

Authors:
Prithviraj Sen; Amol Deshpande; Lise Getoor
Venue:
ICDMW '07 Proceedings of the Seventh IEEE International Conference on Data Mining Workshops
Year:
2007

Citing 0
Cited 7

Materialized views in probabilistic databases: for information exchange and query optimization

VLDB '07 Proceedings of the 33rd international conference on Very large data bases
Using OBDDs for Efficient Query Evaluation on Probabilistic Databases

SUM '08 Proceedings of the 2nd international conference on Scalable Uncertainty Management
Querying web-based applications under models of uncertainty

Proceedings of the VLDB Endowment
$10^{10^{6}}$ worlds and beyond: efficient representation and processing of incomplete information

The VLDB Journal — The International Journal on Very Large Data Bases
Creating probabilistic databases from duplicated data

The VLDB Journal — The International Journal on Very Large Data Bases
PrDB: managing and exploiting rich correlations in probabilistic databases

The VLDB Journal — The International Journal on Very Large Data Bases
Probabilistic skylines on uncertain data: model and bounding-pruning-refining methods

Journal of Intelligent Information Systems

Abstract

There has been a recent surge in work on probabilistic databases, propelled in large part by the huge increase in noisy data sources: sensor data, experimental data, data from uncurated sources, and many others. There is a growing need to flexibly represent the uncertainties in the data and to query the data efficiently. Building on existing probabilistic database work, we present a unifying framework that allows a flexible representation of correlated tuple-level and attribute-level uncertainties. An important capability of our representation is the ability to represent shared correlation structures in the data. We provide motivating examples to illustrate when such shared correlation structures are likely to exist. Representing shared correlation structures allows the use of sophisticated inference techniques based on lifted probabilistic inference, which in turn allow us to achieve significant speedups when computing probabilities for the results of user-submitted queries.