Fast estimation of first-order clause coverage through randomization and maximum likelihood

Authors:
Ondřej Kuželka; Filip Železný
Affiliations:
Czech Technical University in Prague, Czech Republic;Czech Technical University in Prague, Czech Republic
Venue:
Proceedings of the 25th International Conference on Machine Learning
Year:
2008

Abstract

In inductive logic programming, θ-subsumption is a widely used coverage test. Unfortunately, testing θ-subsumption is NP-complete, which makes it a critical efficiency bottleneck for many relational learners. In this paper, we present a probabilistic estimator of clause coverage based on a randomized restarted search strategy. Under a distributional assumption, our algorithm can estimate clause coverage without having to decide subsumption for all examples. We implement this algorithm in the program ReCovEr. On generated graph data and real-world datasets, we show that ReCovEr provides reasonably accurate estimates while achieving dramatic runtime improvements compared to a state-of-the-art algorithm.
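
To make the coverage test concrete, the following is a minimal sketch of a θ-subsumption check with randomized restarts. It is not the ReCovEr algorithm and omits the maximum-likelihood estimation step, which the abstract does not specify. A clause C θ-subsumes an example e if some substitution θ makes every literal of Cθ appear in e. The names `subsumes`, `max_steps`, and `max_tries`, the tuple encoding of literals, and the convention that variables are capitalized strings are all assumptions made for illustration.

```python
import random


class Cutoff(Exception):
    """Raised when one try exceeds its step budget."""


def is_variable(term):
    # Assumed convention: variables are strings starting with an uppercase letter.
    return isinstance(term, str) and term[:1].isupper()


def match(literal, fact, theta):
    """Extend theta so that literal under theta equals fact, or return None."""
    if literal[0] != fact[0] or len(literal) != len(fact):
        return None
    new = dict(theta)
    for term, const in zip(literal[1:], fact[1:]):
        if is_variable(term):
            # Bind the variable, or fail if it is already bound to another constant.
            if new.setdefault(term, const) != const:
                return None
        elif term != const:
            return None
    return new


def search(literals, facts, theta, budget):
    """Backtracking search; budget is a one-element list used as a step counter."""
    if not literals:
        return True
    if budget[0] <= 0:
        raise Cutoff
    budget[0] -= 1
    first, rest = literals[0], literals[1:]
    for fact in facts:
        extended = match(first, fact, theta)
        if extended is not None and search(rest, facts, extended, budget):
            return True
    return False


def subsumes(clause, example, max_steps=1000, max_tries=10, seed=0):
    """Return True or False if some try decides subsumption, None if every try is cut off."""
    rng = random.Random(seed)
    literals, facts = list(clause), list(example)
    for _ in range(max_tries):
        # Each restart uses a fresh random ordering of literals and facts.
        rng.shuffle(literals)
        rng.shuffle(facts)
        try:
            return search(literals, facts, {}, [max_steps])
        except Cutoff:
            continue
    return None


triangle = [("edge", "X", "Y"), ("edge", "Y", "Z"), ("edge", "Z", "X")]
cycle = [("edge", "a", "b"), ("edge", "b", "c"), ("edge", "c", "a")]
path = [("edge", "a", "b"), ("edge", "b", "c")]
print(subsumes(triangle, cycle), subsumes(triangle, path))
```

A try that hits the step cutoff leaves the example undecided, which is why the function can return `None`. In this sketch such examples are simply left open; how undecided examples feed into a coverage estimate is the part the paper addresses and is not reproduced here.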