Efficient sampling in relational feature spaces

Authors:
Filip Železný
Affiliations:
Czech Technical University in Prague, Prague 6, Czech Republic
Venue:
ILP'05 Proceedings of the 15th international conference on Inductive Logic Programming
Year:
2005

Abstract

State-of-the-art algorithms implementing the ‘extended transformation approach’ to propositionalization use backtracking depth-first search to construct relational features (conjunctions of first-order atoms) that comply with the user's mode/type declarations and a few basic syntactic conditions. As such, they incur a complexity factor exponential in the maximum allowed feature size. Here I present an alternative based on an efficient reduction of the feature construction problem to the propositional satisfiability (SAT) problem, such that the latter involves only Horn clauses and is therefore tractable: a model of a propositional Horn theory can be found without backtracking in time linear in the number of literals it contains. This reduction makes it possible either to efficiently enumerate the complete set of correct features (if their total number is polynomial in the maximum feature size) or otherwise to efficiently obtain a random sample from the uniform distribution on the feature space. The proposed sampling method can also efficiently provide an unbiased estimate of the total number of correct features entailed by the user's language declaration.
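
The tractability claim in the abstract rests on a standard property of Horn theories: satisfiability can be decided by forward chaining, with no backtracking. The following is a minimal Python sketch of that general technique, not the paper's reduction from feature construction to SAT. The function name `horn_sat` and the `(body, head)` clause encoding are assumptions made for illustration: a clause reads "if every atom in body is true then head is true", and a head of `None` marks a constraint whose body must not be entirely true.

```python
from collections import defaultdict, deque


def horn_sat(clauses):
    """Return the minimal model (set of true atoms) of a propositional
    Horn theory, or None if the theory is unsatisfiable.

    Hypothetical encoding: each clause is a pair (body, head), where body
    is an iterable of atoms and head is an atom or None (constraint).
    """
    remaining = []                 # per clause: body atoms not yet known true
    heads = []                     # per clause: its head atom (or None)
    watchers = defaultdict(list)   # atom -> clauses with that atom in the body
    queue = deque()                # clauses whose body is fully satisfied

    for i, (body, head) in enumerate(clauses):
        body = set(body)           # ignore repeated body atoms
        remaining.append(len(body))
        heads.append(head)
        for atom in body:
            watchers[atom].append(i)
        if not body:
            queue.append(i)        # facts fire immediately

    true_atoms = set()
    while queue:
        i = queue.popleft()
        head = heads[i]
        if head is None:
            return None            # a constraint is violated: no model exists
        if head in true_atoms:
            continue
        true_atoms.add(head)
        # Each atom becomes true at most once, so every body occurrence is
        # visited at most once: total work is linear in the theory size.
        for j in watchers[head]:
            remaining[j] -= 1
            if remaining[j] == 0:
                queue.append(j)
    return true_atoms


theory = [([], "a"), (["a"], "b"), (["a", "b"], "c"), (["d"], "e")]
print(sorted(horn_sat(theory)))            # ['a', 'b', 'c']
print(horn_sat(theory + [(["c"], None)]))  # None
```

Atoms never forced true by a clause are left false, so the returned set is the minimal model. The per-clause counter is what keeps the procedure linear: no clause body is ever rescanned.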