Deriving predicate statistics for logic rules

Authors:
Senlin Liang;Michael Kifer
Affiliations:
Department of Computer Science, Stony Brook University, Stony Brook, NY;Department of Computer Science, Stony Brook University, Stony Brook, NY
Venue:
RR'12 Proceedings of the 6th international conference on Web Reasoning and Rule Systems
Year:
2012

Citing 26
Cited 1

Equi-depth multidimensional histograms

SIGMOD '88 Proceedings of the 1988 ACM SIGMOD international conference on Management of data
Estimating the size of generalized transitive closures

VLDB '89 Proceedings of the 15th international conference on Very large data bases
On the expected size of recursive Datalog queries

PODS '91 Proceedings of the tenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems
On the propagation of errors in the size of join results

SIGMOD '91 Proceedings of the 1991 ACM SIGMOD international conference on Management of data
Balancing histogram optimality and practicality for query result size estimation

SIGMOD '95 Proceedings of the 1995 ACM SIGMOD international conference on Management of data
Rapid bushy join-order optimization with Cartesian products

SIGMOD '96 Proceedings of the 1996 ACM SIGMOD international conference on Management of data
Improved histograms for selectivity estimation of range predicates

SIGMOD '96 Proceedings of the 1996 ACM SIGMOD international conference on Management of data
Selectivity estimation in spatial databases

SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
Implications of certain assumptions in database performance evaluation

ACM Transactions on Database Systems (TODS)
Independence is good: dependency-based histogram synopses for high-dimensional data

SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
Exploiting statistics on query expressions for optimization

Proceedings of the 2002 ACM SIGMOD international conference on Management of data
Dynamic multidimensional histograms

Proceedings of the 2002 ACM SIGMOD international conference on Management of data
Access path selection in a relational database management system

SIGMOD '79 Proceedings of the 1979 ACM SIGMOD international conference on Management of data
The CORAL deductive system

The VLDB Journal — The International Journal on Very Large Data Bases - Prototypes of deductive database systems
Measuring the Complexity of Join Enumeration in Query Optimization

VLDB '90 Proceedings of the 16th International Conference on Very Large Data Bases
LEO - DB2's LEarning Optimizer

Proceedings of the 27th International Conference on Very Large Data Bases
Universality of Serial Histograms

VLDB '93 Proceedings of the 19th International Conference on Very Large Data Bases
Selectivity Estimation Without the Attribute Value Independence Assumption

VLDB '97 Proceedings of the 23rd International Conference on Very Large Data Bases
Join Enumeration in a Memory-Constrained Environment

ICDE '00 Proceedings of the 16th International Conference on Data Engineering
Database Systems: An Application Oriented Approach, Complete Version (2nd Edition)

Database Systems: An Application Oriented Approach, Complete Version (2nd Edition)
Graph-based synopses for relational selectivity estimation

Proceedings of the 2006 ACM SIGMOD international conference on Management of data
Optimal top-down join enumeration

Proceedings of the 2007 ACM SIGMOD international conference on Management of data
The history of histograms (abridged)

VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
Dynamic programming strikes back

Proceedings of the 2008 ACM SIGMOD international conference on Management of data
Adding magic to an optimising datalog compiler

Proceedings of the 2008 ACM SIGMOD international conference on Management of data
Deriving predicate statistics in datalog

Proceedings of the 12th international ACM SIGPLAN symposium on Principles and practice of declarative programming

Non-termination analysis and cost-based query optimization of logic programs

RR'12 Proceedings of the 6th international conference on Web Reasoning and Rule Systems

Abstract

Database query optimizers rely on data statistics when selecting query execution plans, and rule-based systems can benefit greatly from such optimizations as well. To this end, one first needs to collect data statistics for base predicates and propagate them to derived predicates. However, there are two difficulties: dependencies among arguments and recursion. Earlier we developed an algorithm, called SDP, for efficiently estimating Datalog query sizes by estimating statistical dependency for both base and derived predicates [16]. Base predicate statistics were summarized as dependency matrices, while the statistics for derived predicates were estimated by abstract evaluation of rules over the dependency matrices. This previous work had several limitations. First, it considered only Datalog predicates. Second, it allowed only predicates of arity at most 2, a very serious limitation of the approach. The present paper extends SDP to general rules and n-ary predicates. It also handles negation and mutual recursion, as well as other operations. We also report on our experiments with SDP.