The problem of maintaining consistency via functional dependencies (FDs) has been studied extensively in traditional database settings, and many probabilistic data models have been proposed over the past decades. However, how to maintain consistency in probabilistic relations via FDs remains unclear. In this paper, we clarify the concept of FDs in probabilistic relations and present an efficient chase algorithm, LPChase(r,F), for maintaining the consistency of a probabilistic relation r with respect to an FD set F. LPChase(r,F) adopts a novel approach that uses a Linear Programming (LP) method to modify the probabilities of data values in r. Our approach offers several benefits. First, LPChase(r,F) guarantees that its output is always the minimal change to r. Second, assuming that the expected size of an active domain consisting of data values with non-zero probability is fixed, we demonstrate the interesting result that the LP solving time in LPChase(r,F) decreases as the probabilistic data domains grow, and becomes negligible for large domain sizes. Meanwhile, the I/O time and modeling time stabilize even as the domain size increases.
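
To make the idea of a minimal probability change concrete, the following is a minimal sketch and not the LPChase(r,F) algorithm itself. It rests on two assumptions that are not taken from the abstract: that an FD X → Y requires two tuples agreeing on X to carry the same probability distribution over Y, and that change is measured as the total L1 distance between the old and new distributions. Under those assumptions, the two-tuple case is a small linear program whose optimum can be written down directly: any distribution lying coordinate-wise between the two originals is optimal, and the midpoint is one such choice. The names `reconcile` and `l1_distance` are hypothetical helpers introduced for illustration.

```python
from fractions import Fraction


def l1_distance(p, q):
    """Sum of absolute probability differences over all values in p or q."""
    keys = set(p) | set(q)
    return sum(abs(p.get(k, 0) - q.get(k, 0)) for k in keys)


def reconcile(p, q):
    """Return one shared distribution d minimizing |d - p|_1 + |d - q|_1.

    Assumption (illustrative only): consistency means both tuples end up
    with the same distribution over Y. Each coordinate's cost is at least
    |p_i - q_i|, with equality when d_i lies between p_i and q_i, so the
    midpoint attains the lower bound and still sums to 1.
    """
    keys = sorted(set(p) | set(q))
    return {k: (p.get(k, 0) + q.get(k, 0)) / 2 for k in keys}


# Two tuples agree on X but disagree on the distribution of Y.
p = {"a": Fraction(7, 10), "b": Fraction(3, 10)}
q = {"a": Fraction(1, 2), "c": Fraction(1, 2)}

d = reconcile(p, q)
total_change = l1_distance(d, p) + l1_distance(d, q)
```

Here `total_change` equals `l1_distance(p, q)`, the lower bound on any repair under these assumptions. With more tuples, several FDs, or a different notion of consistency, no such closed form is available in general, which is where a general-purpose LP solver would come in.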