Estimating upper and lower bounds on the performance of word-sense disambiguation programs

Authors:
William Gale;Kenneth Ward Church;David Yarowsky
Affiliations:
AT&T Bell Laboratories, Murray Hill, NJ;AT&T Bell Laboratories, Murray Hill, NJ;AT&T Bell Laboratories, Murray Hill, NJ
Venue:
ACL '92 Proceedings of the 30th annual meeting on Association for Computational Linguistics
Year:
1992

Abstract

We have recently reported on two new word-sense disambiguation systems, one trained on bilingual material (the Canadian Hansards) and the other trained on monolingual material (Roget's Thesaurus and Grolier's Encyclopedia). After using both the monolingual and bilingual classifiers for a few months, we have convinced ourselves that the performance is remarkably good. Nevertheless, we would really like to be able to make a stronger statement, and therefore, we decided to try to develop some more objective evaluation measures.

Although there has been a fair amount of literature on sense-disambiguation, the literature does not offer much guidance in how we might establish the success or failure of a proposed solution such as the two systems mentioned in the previous paragraph. Many papers avoid quantitative evaluations altogether, because it is so difficult to come up with credible estimates of performance.

This paper will attempt to establish upper and lower bounds on the level of performance that can be expected in an evaluation. An estimate of the lower bound of 75% (averaged over ambiguous types) is obtained by measuring the performance produced by a baseline system that ignores context and simply assigns the most likely sense in all cases. An estimate of the upper bound is obtained by assuming that our ability to measure performance is largely limited by our ability to obtain reliable judgments from human informants. Not surprisingly, the upper bound is very dependent on the instructions given to the judges. Jorgensen, for example, suspected that lexicographers tend to depend too much on judgments by a single informant and found considerable variation over judgments (only 68% agreement), as she had suspected. In our own experiments, we have set out to find word-sense disambiguation tasks where the judges can agree often enough so that we could show that they were outperforming the baseline system. Under quite different conditions, we have found 96.8% agreement over judges.
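
The two bounds described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names and the toy data are hypothetical, and the abstract does not say how agreement among judges was computed, so simple pairwise agreement is assumed here.

```python
from collections import Counter
from itertools import combinations


def baseline_accuracy(senses):
    """Accuracy of always guessing the most frequent sense of one word type."""
    counts = Counter(senses)
    return counts.most_common(1)[0][1] / len(senses)


def lower_bound(tagged):
    """Average the most-likely-sense baseline over ambiguous word types only."""
    ambiguous = [s for s in tagged.values() if len(set(s)) > 1]
    return sum(baseline_accuracy(s) for s in ambiguous) / len(ambiguous)


def pairwise_agreement(judgments):
    """Fraction of (judge pair, item) comparisons where both judges agree.

    Assumption: the source does not specify its agreement measure.
    """
    agree = total = 0
    for a, b in combinations(judgments, 2):
        for x, y in zip(a, b):
            agree += x == y
            total += 1
    return agree / total


# Hypothetical toy data: sense labels per occurrence of each word type.
tagged = {
    "bank": ["money", "money", "money", "river"],
    "duty": ["tax", "obligation"],
    "the": ["det", "det"],  # unambiguous, so excluded from the average
}

# Hypothetical toy data: three judges labelling the same four items.
judgments = [
    ["a", "b", "a", "a"],
    ["a", "b", "b", "a"],
    ["a", "b", "a", "a"],
]

print(lower_bound(tagged))
print(pairwise_agreement(judgments))
```

Averaging over word types rather than over occurrences follows the abstract's phrase "averaged over ambiguous types". A system under evaluation would be expected to score between the two values.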