Given sample data and background knowledge encoded in the form of logic programs, a predictive Inductive Logic Programming (ILP) system attempts to find a set of rules (or clauses) for predicting classification labels in the data. Most present-day systems for this purpose rely on some variant of a generate-and-test procedure that repeatedly examines a set of potential candidates (termed here the “hypothesis space”). On each iteration, a search procedure is employed to find the “best” clause. The worst-case time complexity of such systems depends critically on: (1) the size of the hypothesis spaces examined; and (2) the cost of estimating the goodness of a clause. To date, attempts to improve the efficiency of such ILP systems have concentrated either on examining fewer clauses within a given hypothesis space, or on efficient means of estimating the goodness of clauses. The principal means of restricting the size of the hypothesis space itself has been the use of language and search constraints. Given such constraints, this paper investigates the use of a dimensionality reduction method to reduce the size of the hypothesis space further. Specifically, for a particular kind of ILP system, clauses in the search space are represented as points in a high-dimensional space. Using a sample of points from this geometric space, feature selection is used to discard dimensions of little or no (statistical) relevance. The resulting lower-dimensional space translates directly, in the worst case, to a smaller hypothesis space. We evaluate this approach on one controlled domain (graphs) and two real-life datasets from biochemistry (mutagenesis and carcinogenesis). In each case, we obtain unbiased estimates of the size of the hypothesis space before and after feature selection, and compare the results obtained using a complete search of the two spaces.
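
To make the feature-selection step concrete, the following is a minimal Python sketch. It rests entirely on illustrative assumptions, none taken from the paper: clauses sampled from the hypothesis space are encoded as binary vectors whose dimensions correspond to candidate literals, each sampled clause carries a good/bad label standing in for its estimated goodness, and a chi-squared test stands in for whatever statistical relevance measure the actual system applies. The SelectKBest and chi2 utilities come from scikit-learn.

    # Minimal sketch of the feature-selection step described above.
    # Everything here is illustrative: the clause encoding, the labels
    # and the chi-squared test are assumptions, not the paper's method.
    import numpy as np
    from sklearn.feature_selection import SelectKBest, chi2

    rng = np.random.default_rng(0)

    # Hypothetical sample: 200 clauses drawn from the hypothesis space,
    # each encoded as a 0/1 vector over 50 candidate literals.
    X = rng.integers(0, 2, size=(200, 50))

    # Hypothetical good/bad label per clause; here "goodness" happens
    # to depend on just two of the literals.
    y = (X[:, 3] & X[:, 17]).astype(int)

    # Score each dimension for statistical relevance to the labels and
    # keep the k best; dimensions of little or no relevance are dropped.
    selector = SelectKBest(score_func=chi2, k=10)
    X_reduced = selector.fit_transform(X, y)

    kept = selector.get_support(indices=True)
    print("retained dimensions:", kept)

A clause search restricted to the retained dimensions enumerates a strictly smaller space, which is the worst-case saving the abstract refers to.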