Tor, TorMd: distributional profiles of concepts for unsupervised word sense disambiguation

Authors:
Saif Mohammad;Graeme Hirst;Philip Resnik
Affiliations:
University of Toronto, Toronto, ON, Canada;University of Toronto, Toronto, ON, Canada;University of Maryland, College Park, MD
Venue:
SemEval '07 Proceedings of the 4th International Workshop on Semantic Evaluations
Year:
2007

References:
Foundations of statistical natural language processing.
Word-sense disambiguation using statistical models of Roget's categories trained on large corpora. COLING '92 Proceedings of the 14th conference on Computational linguistics - Volume 2.
Finding predominant word senses in untagged text. ACL '04 Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics.
Distributional measures of concept-distance: a task-oriented evaluation. EMNLP '06 Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing.
OntoNotes: the 90% solution. NAACL-Short '06 Proceedings of the Human Language Technology Conference of the NAACL, Companion Volume: Short Papers.
SemEval-2007 task 05: multilingual Chinese-English lexical sample. SemEval '07 Proceedings of the 4th International Workshop on Semantic Evaluations.
SemEval-2007 task 10: English lexical substitution task. SemEval '07 Proceedings of the 4th International Workshop on Semantic Evaluations.
SemEval-2007 task 17: English lexical sample, SRL and all words. SemEval '07 Proceedings of the 4th International Workshop on Semantic Evaluations.

Abstract

Words in the context of a target word have long been used as features by supervised word-sense classifiers. Mohammad and Hirst (2006a) proposed a way to determine the strength of association between a sense or concept and co-occurring words (the distributional profile of a concept, or DPC) without the use of manually annotated data. We implemented an unsupervised naïve Bayes word sense classifier using these DPCs; it was the best unsupervised system, or within one percentage point of the best, in the Multilingual Chinese-English Lexical Sample Task (task #5) and the English Lexical Sample Task (task #17). We also created a simple PMI-based classifier to attempt the English Lexical Substitution Task (task #10); however, its performance was poor.
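
To make the two classifiers mentioned in the abstract concrete, here is a minimal Python sketch, not the authors' implementation. It assumes a DPC can be represented as a table of word-concept co-occurrence counts; the toy counts, the concept labels FINANCE and GEOGRAPHY, the add-alpha smoothing, and the function names `disambiguate` and `pmi` are all hypothetical choices made for illustration. How such counts are obtained without annotated data is not shown.

```python
import math
from collections import Counter

# Hypothetical toy DPCs: concept -> counts of co-occurring words.
# Real DPCs would be estimated from a large corpus; these numbers are invented.
DPC = {
    "FINANCE": Counter({"money": 8, "loan": 6, "river": 1}),
    "GEOGRAPHY": Counter({"river": 9, "water": 7, "money": 1}),
}


def disambiguate(context, dpc, alpha=1.0):
    """Pick the concept with the highest naive Bayes score for the context words.

    Score = log P(concept) + sum of log P(word | concept), with add-alpha
    smoothing (an assumption of this sketch) so unseen words do not zero out
    a concept.
    """
    vocab = {w for counts in dpc.values() for w in counts}
    grand_total = sum(sum(counts.values()) for counts in dpc.values())
    best_concept, best_score = None, float("-inf")
    for concept, counts in dpc.items():
        concept_total = sum(counts.values())
        score = math.log(concept_total / grand_total)  # prior from the counts
        for word in context:
            numerator = counts[word] + alpha
            denominator = concept_total + alpha * len(vocab)
            score += math.log(numerator / denominator)
        if score > best_score:
            best_concept, best_score = concept, score
    return best_concept


def pmi(word, concept, dpc):
    """Pointwise mutual information (base 2) between a word and a concept."""
    grand_total = sum(sum(counts.values()) for counts in dpc.values())
    joint = dpc[concept][word]
    if joint == 0:
        return float("-inf")  # never observed together
    word_total = sum(counts[word] for counts in dpc.values())
    concept_total = sum(dpc[concept].values())
    return math.log2(
        (joint / grand_total)
        / ((word_total / grand_total) * (concept_total / grand_total))
    )


if __name__ == "__main__":
    print(disambiguate(["money", "loan"], DPC))
    print(disambiguate(["river", "water"], DPC))
    print(pmi("money", "FINANCE", DPC))
```

With these toy counts, a context such as "money loan" favors FINANCE and "river water" favors GEOGRAPHY; a positive PMI indicates a word co-occurs with a concept more often than chance would predict.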