Maintaining consistency of probabilistic databases: a linear programming approach

Authors:
You Wu;Wilfred Ng
Affiliations:
Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Hong Kong, China;Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Hong Kong, China
Venue:
ER'10 Proceedings of the 29th international conference on Conceptual modeling
Year:
2010

Abstract

The problem of maintaining consistency via functional dependencies (FDs) has been studied and analyzed extensively within traditional database settings. Many probabilistic data models have also been proposed in the past decades. However, the problem of maintaining consistency in probabilistic relations via FDs remains unclear. In this paper, we clarify the concept of FDs in probabilistic relations and present an efficient chase algorithm LPChase(r,F) for maintaining consistency of a probabilistic relation r with respect to an FD set F. LPChase(r,F) adopts a novel approach that uses a linear programming (LP) method to modify the probabilities of data values in r. Our approach offers several benefits. First, LPChase(r,F) guarantees that its output always represents the minimal change to r. Second, assuming that the expected size of an active domain consisting of data values with non-zero probability is fixed, we demonstrate the interesting result that the LP solving time in LPChase(r,F) decreases as the probabilistic data domains grow, and becomes negligible for large domain sizes. The I/O time and modeling time, on the other hand, become stable even as the domain size increases.
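The abstract does not spell out the LP model used by LPChase(r,F), so the following is only a toy sketch of the general idea of repairing probabilities with minimal change via linear programming. It assumes, purely for illustration, that two tuples agreeing on the left-hand side of an FD X → Y must end up with the same probability distribution over Y, and that "change" is measured as the total absolute difference in probabilities. The function name `reconcile`, the variable layout, and the use of `scipy.optimize.linprog` are choices made for this sketch, not details taken from the paper.

```python
from scipy.optimize import linprog


def reconcile(p, q):
    """Find one distribution d that replaces both p and q with minimal
    total absolute change, |d - p|_1 + |d - q|_1, by solving an LP.

    Illustrative assumption: FD consistency means both tuples share the
    same distribution over the dependent attribute's domain.
    """
    n = len(p)
    # Variable layout: [d_0..d_{n-1}, u_0..u_{n-1}, v_0..v_{n-1}]
    # where u_i >= |d_i - p_i| and v_i >= |d_i - q_i|.
    cost = [0.0] * n + [1.0] * (2 * n)

    a_ub, b_ub = [], []
    for i in range(n):
        for offset, target in ((n, p[i]), (2 * n, q[i])):
            # d_i - slack_i <= target
            row = [0.0] * (3 * n)
            row[i], row[offset + i] = 1.0, -1.0
            a_ub.append(row)
            b_ub.append(target)
            # -d_i - slack_i <= -target
            row = [0.0] * (3 * n)
            row[i], row[offset + i] = -1.0, -1.0
            a_ub.append(row)
            b_ub.append(-target)

    # The repaired values must still form a probability distribution.
    a_eq = [[1.0] * n + [0.0] * (2 * n)]
    b_eq = [1.0]
    bounds = [(0.0, 1.0)] * n + [(0.0, None)] * (2 * n)

    res = linprog(cost, A_ub=a_ub, b_ub=b_ub, A_eq=a_eq, b_eq=b_eq,
                  bounds=bounds, method="highs")
    if not res.success:
        raise ValueError(res.message)
    return list(res.x[:n]), float(res.fun)


if __name__ == "__main__":
    # Two tuples that agree on X but disagree on the distribution of Y.
    p = [0.7, 0.2, 0.1]
    q = [0.5, 0.4, 0.1]
    d, change = reconcile(p, q)
    print("repaired distribution:", d)
    print("total change:", change)
```

In this toy setting the optimal total change equals the L1 distance between the two input distributions, and the optimum is generally not unique: any valid distribution lying componentwise between `p` and `q` attains it. A real chase would have to handle many tuples and many FDs at once, which this sketch does not attempt.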