Maintaining consistency of probabilistic databases: a linear programming approach

  • Authors:
  • You Wu;Wilfred Ng

  • Affiliations:
  • Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Hong Kong, China;Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Hong Kong, China

  • Venue:
  • ER'10: Proceedings of the 29th International Conference on Conceptual Modeling
  • Year:
  • 2010

Abstract

The problem of maintaining consistency via functional dependencies (FDs) has been studied extensively in traditional database settings, and many probabilistic data models have been proposed over the past decades. However, how to maintain consistency in probabilistic relations via FDs remains unclear. In this paper, we clarify the concept of FDs in probabilistic relations and present an efficient chase algorithm, LPChase(r,F), for maintaining consistency of a probabilistic relation r with respect to an FD set F. LPChase(r,F) adopts a novel approach that uses linear programming (LP) to modify the probabilities of data values in r. Our approach has two main benefits. First, LPChase(r,F) guarantees that its output is always the minimal change to r. Second, assuming that the expected size of an active domain consisting of data values with non-zero probability is fixed, we demonstrate the interesting result that the LP solving time in LPChase(r,F) decreases as the probabilistic data domains grow, and becomes negligible for large domain sizes. The I/O time and modeling time, on the other hand, remain stable even as the domain size increases.
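
As a rough illustration of how a minimal-change repair can be posed as a linear program, consider the sketch below. It is not the formulation used by LPChase(r,F), which the abstract does not spell out. It assumes, purely for illustration, that two tuples t_1 and t_2 agree on X, that they carry probability distributions p_1 and p_2 over a finite domain D of an attribute Y, and that the FD X → Y is enforced by replacing both distributions with a common distribution q chosen to minimize the total absolute change. The symbols q, d_{i,v}, and D are hypothetical names introduced here.

```latex
\begin{aligned}
\min_{q,\,d}\quad & \sum_{i \in \{1,2\}} \sum_{v \in D} d_{i,v} \\
\text{s.t.}\quad & d_{i,v} \ge q(v) - p_i(v), && i \in \{1,2\},\ v \in D, \\
& d_{i,v} \ge p_i(v) - q(v), && i \in \{1,2\},\ v \in D, \\
& \sum_{v \in D} q(v) = 1, \\
& q(v) \ge 0, && v \in D.
\end{aligned}
```

At an optimum each auxiliary variable satisfies d_{i,v} = |q(v) - p_i(v)|, so the objective equals the summed L1 distance between q and the two original distributions. The objective and all constraints are linear, so under these assumptions a standard LP solver can be applied.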