IGTree: Using Trees for Compression and Classification in Lazy Learning Algorithms

  • Authors:
  • Walter Daelemans; Antal Van Den Bosch; Ton Weijters

  • Affiliations:
  • Computational Linguistics, Tilburg University, The Netherlands. E-mail: Walter.Daelemans@kub.nl; MATRIKS, Maastricht University, The Netherlands. E-mail: antal@cs.unimmas.nl, weijters@cs.unimmas.nl

  • Venue:
  • Artificial Intelligence Review - Special issue on lazy learning
  • Year:
  • 1997


Abstract

We describe the IGTree learning algorithm, which compresses an instance base into a tree structure. The concept of information gain is used as a heuristic function for performing this compression. IGTree produces trees that, compared to other lazy learning approaches, reduce storage requirements and the time required to compute classifications. Furthermore, we obtained similar or better generalization accuracy with IGTree when trained on two complex linguistic tasks, viz. letter–phoneme transliteration and part-of-speech tagging, when compared to alternative lazy learning and decision tree approaches (viz., IB1, information-gain-weighted IB1, and C4.5). A third experiment, with the task of word hyphenation, demonstrates that when the mutual differences in information gain of features are too small, IGTree as well as information-gain-weighted IB1 perform worse than IB1. These results indicate that IGTree is a useful algorithm for problems characterized by the availability of a large number of training instances described by symbolic features with sufficiently differing information gain values.
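The core idea described in the abstract — ordering symbolic features by information gain and compressing the instance base into a decision trie with class defaults at each node — can be sketched as follows. This is a minimal illustration of the technique, not the authors' implementation; the function names (`build_igtree`, `classify`) and the data structures are invented for this sketch, and details such as pruning and tie-breaking are simplified.

```python
import math
from collections import Counter, defaultdict

def information_gain(instances, feature):
    """Entropy reduction obtained by splitting on one feature index."""
    def entropy(labels):
        total = len(labels)
        return -sum((c / total) * math.log2(c / total)
                    for c in Counter(labels).values())
    labels = [label for _, label in instances]
    by_value = defaultdict(list)
    for features, label in instances:
        by_value[features[feature]].append(label)
    remainder = sum(len(ls) / len(instances) * entropy(ls)
                    for ls in by_value.values())
    return entropy(labels) - remainder

def build_igtree(instances, feature_order):
    """Build a compressed decision trie over features in IG order.

    Each node stores the majority class as a default; branches are
    only expanded while the instances at the node remain ambiguous."""
    labels = [label for _, label in instances]
    default = Counter(labels).most_common(1)[0][0]
    node = {"default": default, "children": {}}
    if len(set(labels)) == 1 or not feature_order:
        return node  # unambiguous (or features exhausted): stop here
    feat, rest = feature_order[0], feature_order[1:]
    by_value = defaultdict(list)
    for inst in instances:
        by_value[inst[0][feat]].append(inst)
    for value, subset in by_value.items():
        node["children"][value] = build_igtree(subset, rest)
    return node

def classify(tree, features, feature_order):
    """Walk feature values in IG order; back off to the last default."""
    node = tree
    for feat in feature_order:
        child = node["children"].get(features[feat])
        if child is None:
            break  # unseen value: fall back to the stored default
        node = child
    return node["default"]

# Toy instance base: (tuple of symbolic features, class label)
data = [(("a", "x"), 1), (("a", "y"), 1), (("b", "x"), 0), (("b", "y"), 1)]
order = sorted(range(2), key=lambda f: -information_gain(data, f))
tree = build_igtree(data, order)
print(classify(tree, ("a", "x"), order))
```

Because unambiguous subtrees collapse into a single default, the trie is typically far smaller than the instance base, and classification touches at most one node per feature — which is the storage and speed advantage the abstract claims over plain instance-based lookup.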