A Restarted Strategy for Efficient Subsumption Testing

Authors:
Ondřej Kuželka; Filip Železný
Affiliations:
Both authors: Intelligent Data Analysis Research Group, Department of Cybernetics, Czech Technical University in Prague, Prague, Czech Republic. Ondřej Kuželka (corresponding author): kuzelo1@fel.cvut.cz; Filip Železný: zelezny@fel.cvut.cz
Venue:
Fundamenta Informaticae - Progress on Multi-Relational Data Mining
Year:
2009


Abstract

We study runtime distributions of subsumption testing. On graph data randomly sampled from two different generative models we observe a gradual growth of the tails of the distributions as a function of the problem instance location in the phase transition space. To avoid the heavy tails, we design a randomized restarted subsumption testing algorithm RESUMER2. The algorithm is complete in that it correctly decides both subsumption and non-subsumption in finite time. A basic restarted strategy is augmented by allowing certain communication between odd and even restarts without losing the exponential runtime distribution decay guarantee resulting from mutual independence of restart pairs. We empirically test RESUMER2 against the state-of-the-art subsumption algorithm Django on generated graph data as well as on the predictive toxicology challenge (PTC) data set. RESUMER2 performs comparably with Django for relatively small examples (tens to hundreds of literals), while for further growing example sizes, RESUMER2 becomes vastly superior.
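The general idea of a complete, randomized restarted subsumption test can be illustrated with a minimal sketch. This is not RESUMER2 itself: the abstract gives no pseudocode, and the sketch omits the communication between odd and even restarts. It assumes clauses and examples are lists of tuples of the form (predicate, arg1, ...), that variables are strings starting with an uppercase letter, and that the step budget grows geometrically between restarts. All function names and parameters (`one_try`, `restarted_subsumes`, `budget`, `growth`) are hypothetical.

```python
import random


class Cutoff(Exception):
    """Raised when one try exhausts its step budget."""


def is_var(term):
    # Assumed convention: variables are strings starting with an uppercase letter.
    return isinstance(term, str) and term[:1].isupper()


def match(literal, fact, theta):
    """Extend substitution theta so that literal maps onto fact, else return None."""
    if literal[0] != fact[0] or len(literal) != len(fact):
        return None
    new = dict(theta)
    for t, c in zip(literal[1:], fact[1:]):
        if is_var(t):
            if new.setdefault(t, c) != c:
                return None  # variable already bound to a different term
        elif t != c:
            return None  # constant mismatch
    return new


def one_try(clause, example, budget, rng):
    """One randomized backtracking search limited to `budget` match attempts."""
    literals = list(clause)
    rng.shuffle(literals)  # random literal order for this try
    facts = list(example)
    steps = 0

    def search(i, theta):
        nonlocal steps
        if i == len(literals):
            return True  # every clause literal has been mapped into the example
        candidates = facts[:]
        rng.shuffle(candidates)  # random value order
        for fact in candidates:
            steps += 1
            if steps > budget:
                raise Cutoff
            extended = match(literals[i], fact, theta)
            if extended is not None and search(i + 1, extended):
                return True
        return False  # subtree exhausted without a cutoff

    return search(0, {})


def restarted_subsumes(clause, example, budget=8, growth=2, seed=0):
    """Restart with a growing budget until one try finishes without a cutoff."""
    rng = random.Random(seed)
    while True:
        try:
            return one_try(clause, example, budget, rng)
        except Cutoff:
            budget *= growth  # the search tree is finite, so this terminates
```

Because the search tree is finite and the budget keeps growing, some try eventually runs to completion, so the sketch decides both subsumption and non-subsumption in finite time. For instance, a directed-triangle clause such as `[("edge", "X", "Y"), ("edge", "Y", "Z"), ("edge", "Z", "X")]` would be tested against a list of ground `edge` facts.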