Forgetting Exceptions is Harmful in Language Learning

  • Authors:
  • Walter Daelemans, Antal van den Bosch, Jakub Zavrel

  • Affiliations:
  • ILK / Computational Linguistics, Tilburg University, P.O. Box 90153, NL-5000 LE Tilburg, The Netherlands. walter@kub.nl, antalb@kub.nl, zavrel@kub.nl

  • Venue:
  • Machine Learning - Special issue on natural language learning
  • Year:
  • 1999

Abstract

We show that in language learning, contrary to received wisdom, keeping exceptional training instances in memory can be beneficial for generalization accuracy. We investigate this phenomenon empirically on a selection of benchmark natural language processing tasks: grapheme-to-phoneme conversion, part-of-speech tagging, prepositional-phrase attachment, and base noun phrase chunking. In a first series of experiments we combine memory-based learning with training set editing techniques, in which instances are edited based on their typicality and class prediction strength. Results show that editing exceptional instances (with low typicality or low class prediction strength) tends to harm generalization accuracy. In a second series of experiments we compare memory-based learning and decision-tree learning methods on the same selection of tasks, and find that decision-tree learning often performs worse than memory-based learning. Moreover, the decrease in performance can be linked to the degree of abstraction from exceptions (i.e., pruning or eagerness). We provide explanations for both results in terms of the properties of the natural language processing tasks and the learning algorithms.
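The editing setup the abstract describes can be pictured with a minimal sketch. The Python below is a hypothetical illustration, not the authors' implementation (the paper uses memory-based learners such as TiMBL with weighted overlap metrics over symbolic features): class prediction strength is approximated here as leave-one-out nearest-neighbour class agreement, instances below a threshold are edited out of memory, and classification is plain k-NN over the retained instances. The function names, the threshold value, and the Euclidean metric are assumptions made for the example.

```python
import numpy as np

def class_prediction_strength(X, y, k=1):
    """Leave-one-out approximation of class prediction strength:
    the fraction of an instance's k nearest neighbours (excluding
    itself) that share its class. Low values mark exceptional
    instances."""
    d = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # pairwise squared distances
    np.fill_diagonal(d, np.inf)          # exclude the instance itself
    nn = np.argsort(d, axis=1)[:, :k]    # indices of the k nearest neighbours
    return (y[nn] == y[:, None]).mean(axis=1)

def edit_exceptions(X, y, threshold=0.5, k=1):
    """Remove instances whose strength falls below the threshold --
    the kind of editing the paper finds harmful on NLP tasks."""
    keep = class_prediction_strength(X, y, k) >= threshold
    return X[keep], y[keep]

def knn_predict(X_train, y_train, X_test, k=1):
    """Memory-based (k-NN) classification over the retained memory."""
    d = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(-1)
    nn = np.argsort(d, axis=1)[:, :k]
    # majority vote among the k neighbours (integer class labels assumed)
    return np.array([np.bincount(y_train[row]).argmax() for row in nn])
```

In this framing, the paper's result corresponds to the observation that classifying with the full (X_train, y_train) memory tends to generalize better on the listed NLP tasks than classifying with the edited memory returned by edit_exceptions.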