Tractable database design and datalog abduction through bounded treewidth

  • Authors:
  • Georg Gottlob; Reinhard Pichler; Fang Wei

  • Affiliations:
  • Computing Laboratory, Oxford University, Oxford OX1 3QD, United Kingdom; Institut für Informationssysteme, Technische Universität Wien, A-1040 Vienna, Austria; Institut für Informatik, Albert-Ludwigs-Universität Freiburg, D-79110 Freiburg i. Br., Germany

  • Venue:
  • Information Systems
  • Year:
  • 2010

Abstract

Given that most elementary problems in database design are NP-hard, the currently used database design algorithms produce suboptimal results. For example, the current 3NF decomposition algorithms may continue decomposing a relation even though it is already in 3NF. In this paper we study database design problems whose sets of functional dependencies have bounded treewidth. For such sets, we develop polynomial-time and highly parallelizable algorithms for a number of central database design problems such as:

  • primality of an attribute;
  • 3NF-test for a relational schema or subschema;
  • BCNF-test for a subschema.

In order to define the treewidth of a relational schema, we associate a hypergraph with it. Note that there are two main ways of defining the treewidth of a hypergraph H: one is via the primal graph of H and one is via the incidence graph of H. Our algorithms apply to the case where the primal graph is considered. However, we also show that the tractability results still hold when the incidence graph is considered instead. It turns out that our results have interesting applications to logic-based abduction. By the well-known relationship between the primality problem in database design and the relevance problem in propositional abduction, our new algorithms and tractability results can be easily carried over from the former field to the latter. Moreover, we show how these tractability results can be further extended from propositional abduction to abductive diagnosis based on non-ground datalog.
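
To make the problems named in the abstract concrete, the following is a minimal brute-force sketch of attribute primality and the 3NF test for a whole schema. It is not the bounded-treewidth algorithm of the paper: it enumerates candidate keys and is therefore exponential in the number of attributes, which is exactly the cost the paper aims to avoid. All function names, and the representation of functional dependencies as pairs of frozensets, are assumptions made for illustration.

```python
from itertools import combinations


def closure(attrs, fds):
    """Closure of a set of attributes under a list of (lhs, rhs) frozenset pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)


def candidate_keys(schema, fds):
    """All minimal keys of the schema, found by exhaustive search (exponential)."""
    keys = []
    attrs = sorted(schema)
    # Enumerating by increasing size guarantees that every key found is minimal,
    # provided supersets of already-found keys are skipped.
    for size in range(len(attrs) + 1):
        for combo in combinations(attrs, size):
            cand = frozenset(combo)
            if any(k <= cand for k in keys):
                continue
            if closure(cand, fds) >= schema:
                keys.append(cand)
    return keys


def is_prime(attr, schema, fds):
    """An attribute is prime if it occurs in at least one candidate key."""
    return any(attr in k for k in candidate_keys(schema, fds))


def is_3nf(schema, fds):
    """3NF test for the whole schema: each given FD X -> A must be trivial,
    have a superkey X, or have a prime attribute A.
    (Checking only the given FDs is enough for the full schema, not for a subschema.)"""
    prime = set().union(*candidate_keys(schema, fds))
    for lhs, rhs in fds:
        if closure(lhs, fds) >= schema:
            continue  # lhs is a superkey
        if any(a not in lhs and a not in prime for a in rhs):
            return False
    return True


# Hypothetical example schemas.
R = frozenset("ABC")
F1 = [(frozenset("AB"), frozenset("C")), (frozenset("C"), frozenset("B"))]
F2 = [(frozenset("A"), frozenset("B")), (frozenset("B"), frozenset("C"))]

print(candidate_keys(R, F1))   # keys {A,B} and {A,C}
print(is_3nf(R, F1))           # every attribute is prime
print(is_prime("C", R, F2))    # the only key is {A}
print(is_3nf(R, F2))           # B -> C violates 3NF
```

In the first example every attribute belongs to some key, so the schema passes the 3NF test although C → B has a non-key left-hand side. In the second, the transitive dependency B → C on the non-prime attribute C makes the test fail.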