A new supervised learning algorithm for word sense disambiguation

Authors:
Ted Pedersen;Rebecca Bruce
Affiliations:
Department of Computer Science and Engineering, Southern Methodist University, Dallas, TX;Department of Computer Science and Engineering, Southern Methodist University, Dallas, TX
Venue:
AAAI'97/IAAI'97 Proceedings of the fourteenth national conference on artificial intelligence and ninth conference on Innovative applications of artificial intelligence
Year:
1997

References (12)

C4.5: programs for machine learning
A Weighted Nearest Neighbor Algorithm for Learning with Symbolic Features. Machine Learning
Recursive Automatic Bias Selection for Classifier Construction. Machine Learning - Special issue on bias evaluation and selection
Unifying instance-based and rule-based induction. Machine Learning
A maximum entropy approach to natural language processing. Computational Linguistics
The CN2 Induction Algorithm. Machine Learning
Sequential model selection for word sense disambiguation. ANLC '97 Proceedings of the fifth conference on Applied natural language processing
Word-sense disambiguation using decomposable models. ACL '94 Proceedings of the 32nd annual meeting on Association for Computational Linguistics
Integrating multiple knowledge sources to disambiguate word sense: an exemplar-based approach. ACL '96 Proceedings of the 34th annual meeting on Association for Computational Linguistics
Corpus-based statistical sense resolution. HLT '93 Proceedings of the workshop on Human Language Technology
Graphical Models in Applied Multivariate Statistics
Significant lexical relationships. AAAI'96 Proceedings of the thirteenth national conference on Artificial intelligence - Volume 1

Cited by (9)

Lexical Semantic Ambiguity Resolution with Bigram-Based Decision Trees. CICLing '01 Proceedings of the Second International Conference on Computational Linguistics and Intelligent Text Processing
A Baseline Methodology for Word Sense Disambiguation. CICLing '02 Proceedings of the Third International Conference on Computational Linguistics and Intelligent Text Processing
A simple approach to building ensembles of Naive Bayesian classifiers for word sense disambiguation. NAACL 2000 Proceedings of the 1st North American chapter of the Association for Computational Linguistics conference
Feature lattices for maximum entropy modelling. COLING '98 Proceedings of the 17th international conference on Computational linguistics - Volume 2
A decision tree of bigrams is an accurate predictor of word sense. NAACL '01 Proceedings of the second meeting of the North American Chapter of the Association for Computational Linguistics on Language technologies
An empirical evaluation of knowledge sources and learning algorithms for word sense disambiguation. EMNLP '02 Proceedings of the ACL-02 conference on Empirical methods in natural language processing - Volume 10
Lexical Constellations and the Structure of Meaning: A Prototype Application to WSD. CICLing '07 Proceedings of the 8th International Conference on Computational Linguistics and Intelligent Text Processing
Knowledge lean word sense disambiguation. AAAI'97/IAAI'97 Proceedings of the fourteenth national conference on artificial intelligence and ninth conference on Innovative applications of artificial intelligence
Naive mixes for word sense disambiguation. AAAI'97/IAAI'97 Proceedings of the fourteenth national conference on artificial intelligence and ninth conference on Innovative applications of artificial intelligence

Abstract

The Naive Mix is a new supervised learning algorithm based on a sequential method for selecting probabilistic models. The usual objective of model selection is to find a single model that adequately characterizes the data in a training sample. During model selection, however, a sequence of models is generated that consists of the best-fitting model at each level of model complexity. The Naive Mix uses this whole sequence to define a probabilistic model, which then serves as a probabilistic classifier for word-sense disambiguation. The models in the sequence are restricted to the class of decomposable log-linear models, which offers a number of computational advantages. Experiments disambiguating twelve different words show that a Naive Mix formulated with a forward sequential search and Akaike's Information Criterion rivals established supervised learning algorithms such as decision trees (C4.5), rule induction (CN2), and nearest-neighbor classification (PEBLS).
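
The idea in the abstract can be illustrated with a small Python sketch. It is a simplification under stated assumptions, not the authors' implementation: the search is restricted to naive-Bayes-shaped models (one simple family of decomposable models), AIC is computed from the conditional log-likelihood of the sense, and the models in the sequence are combined by a plain average of their sense distributions, since the abstract does not give the combination rule. All names (`fit_nb`, `forward_sequence`, `naive_mix`) and the toy data are hypothetical.

```python
from collections import Counter
from math import log

def fit_nb(data, feats, alpha=1.0):
    """Fit a naive-Bayes-shaped model that uses only the feature indices in `feats`."""
    return {
        "feats": tuple(feats),
        "senses": sorted({s for _, s in data}),
        "prior": Counter(s for _, s in data),
        "cond": Counter((f, x[f], s) for x, s in data for f in feats),
        "values": {f: sorted({x[f] for x, _ in data}) for f in feats},
        "n": len(data),
        "alpha": alpha,  # add-alpha smoothing (an assumption of this sketch)
    }

def posterior(model, x):
    """Return P(sense | x) under one model."""
    scores = {}
    for s in model["senses"]:
        p = model["prior"][s] / model["n"]
        for f in model["feats"]:
            num = model["cond"][(f, x[f], s)] + model["alpha"]
            den = model["prior"][s] + model["alpha"] * len(model["values"][f])
            p *= num / den
        scores[s] = p
    z = sum(scores.values())
    return {s: p / z for s, p in scores.items()}

def aic(model, data):
    """AIC = 2k - 2 log L, with L the conditional likelihood of the senses."""
    loglik = sum(log(posterior(model, x)[s]) for x, s in data)
    n_senses = len(model["senses"])
    k = (n_senses - 1) + sum(
        n_senses * (len(model["values"][f]) - 1) for f in model["feats"]
    )
    return 2 * k - 2 * loglik

def forward_sequence(data, n_feats):
    """Forward search: keep the best model at each complexity level until AIC stops improving."""
    chosen, remaining = [], set(range(n_feats))
    sequence = [fit_nb(data, chosen)]
    best = aic(sequence[0], data)
    while remaining:
        score, f = min((aic(fit_nb(data, chosen + [f]), data), f) for f in sorted(remaining))
        if score >= best:
            break
        chosen.append(f)
        remaining.remove(f)
        best = score
        sequence.append(fit_nb(data, chosen))
    return sequence

def naive_mix(sequence, x):
    """Average the sense distributions of every model in the sequence."""
    senses = sequence[0]["senses"]
    mix = {s: sum(posterior(m, x)[s] for m in sequence) / len(sequence) for s in senses}
    return max(mix, key=mix.get), mix

# Hypothetical toy data: (feature tuple, sense) pairs for an ambiguous word.
toy = [
    (("river", "the"), "shore"),
    (("river", "a"), "shore"),
    (("money", "the"), "finance"),
    (("money", "a"), "finance"),
] * 5

sequence = forward_sequence(toy, n_feats=2)
sense, mix = naive_mix(sequence, ("river", "the"))
print(sense, mix)
```

On this toy sample the search keeps the empty model and the model that adds the first feature, and the averaged distribution selects the sense "shore" for the context ("river", "the"). Averaging over the sequence rather than using only the final model is the point being illustrated; the specific averaging rule here is an assumption.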