Faster query answering in probabilistic databases using read-once functions

  • Authors:
  • Sudeepa Roy;Vittorio Perduca;Val Tannen

  • Affiliations:
  • University of Pennsylvania;University of Pennsylvania;University of Pennsylvania

  • Venue:
  • Proceedings of the 14th International Conference on Database Theory
  • Year:
  • 2011

Quantified Score

Hi-index 0.00

Visualization

Abstract

A boolean expression is in read-once form if each of its variables appears exactly once. When the variables denote independent events in a probability space, the probability of the event denoted by the whole expression in read-once form can be computed in polynomial time (whereas the general problem for arbitrary expressions is #P-complete). Known approaches to checking read-once property seem to require putting these expressions in disjunctive normal form. In this paper, we tell a better story for a large subclass of boolean event expressions: those that are generated by conjunctive queries without self-joins and on tuple-independent probabilistic databases. We first show that given a tuple-independent representation and the provenance graph of an SPJ query plan without self-joins, we can, without using the DNF of a result event expression, efficiently compute its co-occurrence graph. From this, the read-once form can already, if it exists, be computed efficiently using existing techniques. Our second and key contribution is a complete, efficient, and simple to implement algorithm for computing the read-once forms (whenever they exist) directly, using a new concept, that of co-table graph, which can be significantly smaller than the cooccurrence graph.