Feature Construction with Inductive Logic Programming: A Study of Quantitative Predictions of Biological Activity Aided by Structural Attributes

Authors:
Ashwin Srinivasan; Ross D. King
Affiliations:
Oxford University Computing Laboratory, Oxford UK. ashwin@comlab.ox.ac.uk; Department of Computer Science, University of Wales, Aberystwyth, Wales UK. rdk@aber.ac.uk
Venue:
Data Mining and Knowledge Discovery
Year:
1999

Abstract

Recently, computer programs developed within the field of Inductive Logic Programming (ILP) have received some attention for their ability to construct restricted first-order logic solutions using problem-specific background knowledge. Prominent applications of such programs have been concerned with determining “structure-activity” relationships in the areas of molecular biology and chemistry. Typically the task here is to predict the “activity” of a compound (for example, toxicity), from its chemical structure. A summary of the research in the area is: (a) ILP programs have largely been restricted to qualitative predictions of activity (“high”, “low” etc.); (b) When appropriate attributes are available, ILP programs have equivalent predictivity to standard quantitative analysis techniques like linear regression. However ILP programs usually perform better when such attributes are unavailable; and (c) By using structural information as background knowledge, an ILP program can provide comprehensible explanations for biological activity. This paper examines the use of ILP programs as a method of “discovering” new attributes. These attributes could then be used by methods like linear regression, thus allowing for quantitative predictions while retaining the ability to use structural information as background knowledge. Using structure-activity tasks as a test-bed, the utility of ILP programs in constructing new features was evaluated by examining the prediction of biological activity using linear regression, with and without the aid of ILP learnt logical attributes. In three out of the five data sets examined the addition of ILP attributes produced statistically better results. In addition six important structural features that have escaped the attention of the expert chemists were discovered. The method used here to construct new attributes is not specific to the problem of predicting biological activity, and the results obtained suggest a wider role for ILP programs in aiding the process of scientific discovery.
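
The comparison described in the abstract, linear regression with and without logical attributes, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic, the names `x`, `has_substructure`, and `fit_and_score` are hypothetical, and the boolean column merely stands in for the truth value of an ILP-learnt clause on each compound. It is not the authors' data, ILP system, or evaluation procedure.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Hypothetical numeric attribute (stand-in for a conventional chemical descriptor).
x = rng.normal(size=n)

# Hypothetical boolean attribute: true when a compound satisfies some
# structural clause. Here it is drawn at random, not learnt by ILP.
has_substructure = rng.random(n) < 0.4

# Synthetic "activity" that depends on both attributes plus noise.
y = 1.5 * x + 2.0 * has_substructure + rng.normal(scale=0.5, size=n)

# Simple hold-out split.
idx = rng.permutation(n)
train, test = idx[:150], idx[150:]


def fit_and_score(features, target, train_idx, test_idx):
    """Fit ordinary least squares on the training rows and
    return the mean squared error on the held-out rows."""
    design = np.column_stack([np.ones(len(target)), features])
    coef, *_ = np.linalg.lstsq(design[train_idx], target[train_idx], rcond=None)
    residuals = target[test_idx] - design[test_idx] @ coef
    return float(np.mean(residuals ** 2))


# Regression on the numeric attribute alone.
mse_without = fit_and_score(x, y, train, test)

# Regression with the boolean attribute added as a 0/1 column.
augmented = np.column_stack([x, has_substructure.astype(float)])
mse_with = fit_and_score(augmented, y, train, test)

print(f"held-out MSE without logical attribute: {mse_without:.3f}")
print(f"held-out MSE with logical attribute:    {mse_with:.3f}")
```

Because the synthetic activity was generated to depend on the boolean attribute, the augmented model fits better here by construction; on real data the benefit would have to be established by an appropriate statistical comparison.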