Deriving predicate statistics for logic rules

  • Authors:
  • Senlin Liang;Michael Kifer

  • Affiliations:
  • Department of Computer Science, Stony Brook University, Stony Brook, NY;Department of Computer Science, Stony Brook University, Stony Brook, NY

  • Venue:
  • RR'12 Proceedings of the 6th international conference on Web Reasoning and Rule Systems
  • Year:
  • 2012

Quantified Score

Hi-index 0.00

Visualization

Abstract

Database query optimizers rely on data statistics in selecting query execution plans and rule-based systems can greatly benefit from such optimizations as well. To this end, one first needs to collect data statistics for base and propagate them to derived predicates. However, there are two difficulties: dependencies among arguments and recursion. Earlier we developed an algorithm, called SDP, for estimating Datalog query sizes efficiently by estimating statistical dependency for both base and derived predicates [16]. Base predicate statistics were summarized as dependency matrices, while the statistics for derived predicate were estimated by abstract evaluation of rules over the dependency matrices. This previous work had several limitations. First, it only considered Datalog predicates. Second, only predicates of arity at most 2 were allowed--a very serious limitation of the approach. The present paper extends SDP to general rules and n-ary predicates. It also handles negation and mutual recursions as well as other operations. We also report on our experiments with SDP.