Feature Construction Using Theory-Guided Sampling and Randomised Search

  • Authors:
  • Sachindra Joshi, Ganesh Ramakrishnan, Ashwin Srinivasan

  • Affiliations:
  • IBM India Research Laboratory, India 110070 (all authors); Ashwin Srinivasan also with the Dept. of Computer Science and Engineering & Centre for Health Informatics, University of New South Wales, Sydney

  • Venue:
  • ILP '08: Proceedings of the 18th International Conference on Inductive Logic Programming
  • Year:
  • 2008

Abstract

It has repeatedly been found that very good predictive models can result from using Boolean features constructed by an Inductive Logic Programming (ILP) system with access to relevant relational information. The process of feature construction by an ILP system, sometimes called "propositionalization", has largely been done either as a pre-processing step (in which a large set of possibly useful features is constructed first, and then a predictive model is constructed) or by tightly coupling feature construction and model construction (in which a predictive model is constructed with each new feature, and only those that result in a significant improvement in performance are retained). These represent two extremes, similar in spirit to filter- and wrapper-based approaches to feature selection. An interesting third perspective on the problem arises by taking a search-based view of feature construction. In this view, we conceptually treat the task as searching through subsets of all possible features that can be constructed by the ILP system. Clearly, an exhaustive search of such a space will usually be intractable. We resort instead to a randomised local search which repeatedly constructs, randomly (but non-uniformly), a subset of features and then performs a greedy local search starting from this subset. The number of possible features usually prohibits an enumeration of all local moves. Consequently, the next move in the search space is guided by the errors made by the model constructed using the current set of features. This can be seen as sampling non-uniformly from the set of all possible local moves, with a view to selecting only those capable of improving performance. The result is a procedure in which a feature subset is initially generated in the pre-processing style, but further alterations are guided actively by actual model predictions. We test this procedure on the language-processing task of word-sense disambiguation. Good models have previously been obtained for this task using an SVM in conjunction with ILP features constructed in the pre-processing style. Our results improve on these previous results: predictive accuracies are usually higher, and substantially fewer features are needed.
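To make the randomised local search described above concrete, here is a minimal sketch of restarts plus greedy hill climbing over feature subsets. It illustrates the general technique, not the authors' implementation: the uniform random restart stands in for the paper's theory-guided, non-uniform initial sampling, and the caller-supplied move sampler stands in for the error-guided selection of local moves. All names (randomised_feature_search, fit_and_score, propose_moves) are hypothetical.

```python
import random

def randomised_feature_search(all_features, fit_and_score, propose_moves,
                              n_restarts=5, max_moves=50, init_size=10):
    """Randomised restarts + greedy local search over feature subsets.

    fit_and_score(subset) -> float: trains a model on the subset and
        returns an estimate of its predictive accuracy.
    propose_moves(subset) -> iterable of features whose membership we
        consider flipping; in the paper this sampling would be guided
        by the current model's errors.
    """
    best, best_score = set(), float("-inf")
    for _ in range(n_restarts):
        # Random initial subset (the paper draws this non-uniformly,
        # guided by theory; plain uniform sampling is used here).
        current = set(random.sample(all_features,
                                    min(init_size, len(all_features))))
        score = fit_and_score(current)
        for _ in range(max_moves):
            improved = False
            for f in propose_moves(current):
                candidate = current ^ {f}      # flip f: add or drop it
                cand_score = fit_and_score(candidate)
                if cand_score > score:         # greedy: first improvement
                    current, score = candidate, cand_score
                    improved = True
                    break
            if not improved:
                break                          # local optimum; restart
        if score > best_score:
            best, best_score = current, score
    return best, best_score

if __name__ == "__main__":
    # Toy check: features 0-19 are "useful"; the score rewards overlap
    # with them and mildly penalises irrelevant features.
    truth = set(range(20))
    feats = list(range(100))
    fit_and_score = lambda s: len(s & truth) - 0.1 * len(s - truth)
    propose_moves = lambda s: random.sample(feats, 10)
    print(randomised_feature_search(feats, fit_and_score, propose_moves))
```

Because only a sampled handful of flips is scored per step, rather than every possible local move, the cost per iteration stays bounded even when the ILP system can construct a very large feature space, which is the point of the error-guided sampling described in the abstract.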