The Effect of Relational Background Knowledge on Learning of Protein Three-Dimensional Fold Signatures

  • Authors:
  • Marcel Turcotte; Stephen H. Muggleton; Michael J. E. Sternberg

  • Affiliations:
  • Biomolecular Modelling Laboratory, Imperial Cancer Research Fund, P.O. Box 123, London WC2A 3PX, UK. m.turcotte@icrf.icnet.uk
  • Department of Computer Science, University of York, Heslington, York, YO1 5DD, UK. stephen@cs.york.ac.uk
  • Biomolecular Modelling Laboratory, Imperial Cancer Research Fund, P.O. Box 123, London WC2A 3PX, UK. m.sternberg@icrf.icnet.uk

  • Venue:
  • Machine Learning
  • Year:
  • 2001

Abstract

As a form of Machine Learning, the study of Inductive Logic Programming (ILP) is motivated by a central belief: relational description languages are better (in terms of accuracy and understandability) than propositional ones for certain real-world applications. This claim is investigated here for a particular application in structural molecular biology, that of constructing readable descriptions of the major protein folds. To the authors' knowledge, Machine Learning has not previously been applied systematically to this task. In this application, the domain expert (third author) identified a natural divide between essentially propositional features and more structurally-oriented relational ones. The following null hypotheses are tested: 1) for a given ILP system (Progol), provision of relational background knowledge does not increase predictive accuracy; 2) a good propositional learning system (C5.0) without relational background knowledge will outperform Progol with relational background knowledge; 3) relational background knowledge does not produce improved explanatory insight. Null hypotheses 1) and 2) are both refuted by cross-validation results carried out over 20 of the most populated protein folds. Hypothesis 3) is refuted by demonstrating various insightful rules that were discovered only among the relationally-oriented learned rules.
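To make the divide mentioned in the abstract more concrete, the sketch below contrasts a propositional description of a protein domain (a fixed-length record a learner such as C5.0 can consume directly) with relational, structurally-oriented facts of the kind an ILP system such as Progol can quantify over. It is purely illustrative: all attribute names, predicates, and values are hypothetical placeholders, not the paper's actual representation.

```python
# Illustrative sketch only; the identifiers and values below are invented.

# Propositional view: one fixed-length record of global attributes per domain.
propositional_example = {
    "domain_id": "d1",          # placeholder identifier
    "length": 148,              # number of residues (made-up value)
    "n_helices": 5,
    "n_strands": 0,
    "fold_class": "all-alpha",  # target label to be predicted
}

# Relational view: a variable-size set of ground facts describing
# secondary-structure elements and the relations between them.
relational_facts = [
    ("helix",       "d1", "h1"),       # helix h1 belongs to domain d1
    ("helix",       "d1", "h2"),
    ("adjacent",    "h1", "h2"),       # h1 and h2 are sequence-adjacent
    ("loop_length", "h1", "h2", 4),    # loop of 4 residues between them
]

# A learned fold "signature" over such facts would be a first-order rule,
# e.g. (again purely illustrative, not a rule from the paper):
#   fold(all_alpha, D) :- helix(D, H1), helix(D, H2), adjacent(H1, H2).
```

The point of the contrast is that the propositional record must summarise structure into a fixed set of counts, whereas the relational facts let a rule refer to individual secondary-structure elements and their arrangement, which is where the additional accuracy and explanatory insight reported in the paper can come from.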