Scaling Up Inductive Logic Programming by Learning from Interpretations

  • Authors:
  • Hendrik Blockeel; Luc De Raedt; Nico Jacobs; Bart Demoen

  • Affiliations:
  • Department of Computer Science, Katholieke Universiteit Leuven, Celestijnenlaan 200A, B-3001 Heverlee, Belgium. Emails: hendrik.blockeel@cs.kuleuven.ac.be; luc.deraedt@cs.kuleuven.ac.be; nico.jacobs@cs.kuleuven.ac.be; bart.demoen@cs.kuleuven.ac.be

  • Venue:
  • Data Mining and Knowledge Discovery
  • Year:
  • 1999

Abstract

When comparing inductive logic programming (ILP) and attribute-value learning techniques, there is a trade-off between expressive power and efficiency. Inductive logic programming techniques are typically more expressive but also less efficient. As a result, the data sets handled by current inductive logic programming systems are small by the general standards of the data mining community. The main source of inefficiency lies in the assumption that several examples may be related to each other, so that they cannot be handled independently. Within the learning from interpretations framework for inductive logic programming this assumption is unnecessary, which makes it possible to scale up existing ILP algorithms. In this paper we explain this learning setting in the context of relational databases. We relate the setting to propositional data mining and to the classical ILP setting, and show that learning from interpretations corresponds to learning from multiple relations and thus extends the expressiveness of propositional learning while largely maintaining its efficiency (which is not the case in the classical ILP setting). As a case study, we present two alternative implementations of the ILP system TILDE (Top-down Induction of Logical DEcision trees): TILDEclassic, which loads all data into main memory, and TILDELDS, which loads the examples one by one. We experimentally compare the two implementations, showing that TILDELDS can handle large data sets (on the order of 100,000 examples or 100 MB) and indeed scales up linearly in the number of examples.
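The key point of the abstract can be illustrated with a small sketch. TILDE itself is a Prolog system, and the code below is only a hypothetical Python illustration (the function name `stream_split_counts`, the fact representation, and the toy data are all invented here): because each example in the learning-from-interpretations setting is a self-contained set of facts (an interpretation), a candidate test can be evaluated one example at a time, in the spirit of TILDELDS, so memory use stays constant in the number of examples.

```python
from collections import Counter

def stream_split_counts(examples, test):
    """Count (test outcome, class) pairs in a single pass over examples.

    examples: iterable of (interpretation, label) pairs, where an
              interpretation is a set of ground facts (tuples) --
              each example is independent of all the others.
    test:     a predicate interpretation -> bool, standing in for a
              conjunctive query tried at a decision-tree node.
    """
    counts = Counter()
    for interpretation, label in examples:  # examples loaded one by one
        counts[(test(interpretation), label)] += 1
    return counts

# Toy data: three independent interpretations with class labels.
examples = [
    ({("parent", "ann", "bob"), ("male", "bob")}, "pos"),
    ({("parent", "eve", "sue")}, "neg"),
    ({("male", "tom")}, "pos"),
]

# Candidate test: does the interpretation contain any male/1 fact?
has_male = lambda itp: any(fact[0] == "male" for fact in itp)

print(stream_split_counts(examples, has_male))
# → Counter({(True, 'pos'): 2, (False, 'neg'): 1})
```

Since only the running counts are kept between examples, the same loop works whether the examples come from a list in memory (as in TILDEclassic) or are read from disk one by one (as in TILDELDS), which is what allows the linear scaling the paper reports.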