IGTree: Using Trees for Compression and Classification in Lazy Learning Algorithms

  • Authors:
  • Walter Daelemans; Antal Van Den Bosch; Ton Weijters

  • Affiliations:
  • Computational Linguistics, Tilburg University, The Netherlands. E-mail: Walter.Daelemans@kub.nl; MATRIKS, Maastricht University, The Netherlands. E-mail: antal@cs.unimmas.nl, weijters@cs.unimmas.nl

  • Venue:
  • Artificial Intelligence Review - Special issue on lazy learning
  • Year:
  • 1997


Abstract

We describe the IGTree learning algorithm, which compresses an instance base into a tree structure. The concept of information gain is used as a heuristic function for performing this compression. IGTree produces trees that, compared to other lazy learning approaches, reduce storage requirements and the time required to compute classifications. Furthermore, we obtained similar or better generalization accuracy with IGTree when trained on two complex linguistic tasks, viz. letter–phoneme transliteration and part-of-speech tagging, when compared to alternative lazy learning and decision tree approaches (viz., IB1, information-gain-weighted IB1, and C4.5). A third experiment, with the task of word hyphenation, demonstrates that when the mutual differences in information gain of features are too small, IGTree as well as information-gain-weighted IB1 perform worse than IB1. These results indicate that IGTree is a useful algorithm for problems characterized by the availability of a large number of training instances described by symbolic features with sufficiently differing information gain values.
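The core idea described in the abstract — ordering symbolic features by information gain and compressing the instance base into a decision trie with class defaults at each node — can be sketched as follows. This is a minimal illustration of the technique, not the authors' implementation; the function names (`build_igtree`, `classify`) and the data structures are invented for this sketch, and details such as pruning and tie-breaking are simplified.

```python
import math
from collections import Counter, defaultdict

def information_gain(instances, feature):
    """Entropy reduction obtained by splitting on one feature index."""
    def entropy(labels):
        total = len(labels)
        return -sum((c / total) * math.log2(c / total)
                    for c in Counter(labels).values())
    labels = [label for _, label in instances]
    by_value = defaultdict(list)
    for features, label in instances:
        by_value[features[feature]].append(label)
    remainder = sum(len(ls) / len(instances) * entropy(ls)
                    for ls in by_value.values())
    return entropy(labels) - remainder

def build_igtree(instances, feature_order):
    """Build a compressed decision trie over features in IG order.

    Each node stores the majority class as a default; branches are
    only expanded while the instances at the node remain ambiguous."""
    labels = [label for _, label in instances]
    default = Counter(labels).most_common(1)[0][0]
    node = {"default": default, "children": {}}
    if len(set(labels)) == 1 or not feature_order:
        return node  # unambiguous (or features exhausted): stop here
    feat, rest = feature_order[0], feature_order[1:]
    by_value = defaultdict(list)
    for inst in instances:
        by_value[inst[0][feat]].append(inst)
    for value, subset in by_value.items():
        node["children"][value] = build_igtree(subset, rest)
    return node

def classify(tree, features, feature_order):
    """Walk feature values in IG order; back off to the last default."""
    node = tree
    for feat in feature_order:
        child = node["children"].get(features[feat])
        if child is None:
            break  # unseen value: fall back to the stored default
        node = child
    return node["default"]

# Toy instance base: (tuple of symbolic features, class label)
data = [(("a", "x"), 1), (("a", "y"), 1), (("b", "x"), 0), (("b", "y"), 1)]
order = sorted(range(2), key=lambda f: -information_gain(data, f))
tree = build_igtree(data, order)
print(classify(tree, ("a", "x"), order))
```

Because unambiguous subtrees collapse into a single default, the trie is typically far smaller than the instance base, and classification touches at most one node per feature — which is the storage and speed advantage the abstract claims over plain instance-based lookup.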