PUTOP: turning predominant senses into a topic model for word sense disambiguation

Authors:
Jordan Boyd-Graber;David Blei
Affiliations:
Princeton University, Princeton, NJ;Princeton University, Princeton, NJ
Venue:
SemEval '07 Proceedings of the 4th International Workshop on Semantic Evaluations
Year:
2007

References (7):
An Information-Theoretic Definition of Similarity. ICML '98 Proceedings of the Fifteenth International Conference on Machine Learning
Latent Dirichlet allocation. The Journal of Machine Learning Research
A semantic concordance. HLT '93 Proceedings of the workshop on Human Language Technology
Understanding the Yarowsky Algorithm. Computational Linguistics
Finding predominant word senses in untagged text. ACL '04 Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics
SemEval-2007 task 01: evaluating WSD on cross-language information retrieval. SemEval '07 Proceedings of the 4th International Workshop on Semantic Evaluations
SemEval-2007 task 17: English lexical sample, SRL and all words. SemEval '07 Proceedings of the 4th International Workshop on Semantic Evaluations

Cited by (3):
Bayesian word sense induction. EACL '09 Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics
Topic models for word sense disambiguation and token-based idiom detection. ACL '10 Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics
Semantic topic models: combining word distributional statistics and dictionary definitions. EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing

Abstract

We extend McCarthy et al.'s predominant sense method to build an unsupervised approach to word sense disambiguation that uses topics derived automatically with latent Dirichlet allocation. With topic-specific synset similarity measures, we predict a sense for each word in each document using only word frequency information. We hope that this procedure improves on the original method for larger numbers of topics by providing more relevant training corpora for the individual topics. We evaluate the method on SemEval-2007 Task 1 and Task 17.
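
The idea in the abstract, choosing a predominant sense separately within each topic, can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that each document has already been assigned a single topic (hard-coded here in place of a latent Dirichlet allocation run), that the sense inventory `SENSES` and the similarity table `SIMILARITY` are hypothetical toy stand-ins for WordNet synsets and a synset similarity measure, and that weighting each context word by its raw frequency within the topic is a simplification of the predominant sense scoring.

```python
from collections import Counter, defaultdict

# Hypothetical toy sense inventory: ambiguous word -> candidate sense labels.
SENSES = {"bank": ["bank%finance", "bank%river"]}

# Hypothetical sense-to-word similarity scores, standing in for a
# WordNet-based synset similarity measure.
SIMILARITY = {
    ("bank%finance", "money"): 0.9,
    ("bank%finance", "loan"): 0.8,
    ("bank%finance", "water"): 0.1,
    ("bank%finance", "fish"): 0.1,
    ("bank%river", "money"): 0.1,
    ("bank%river", "loan"): 0.1,
    ("bank%river", "water"): 0.9,
    ("bank%river", "fish"): 0.7,
}


def topic_word_counts(docs, doc_topics):
    """Pool word frequencies over all documents assigned to the same topic."""
    counts = defaultdict(Counter)
    for doc, topic in zip(docs, doc_topics):
        counts[topic].update(doc)
    return counts


def predominant_sense(word, context_counts):
    """Pick the sense with the highest frequency-weighted similarity score.

    Each context word contributes its frequency times the sense's share of
    the total similarity that all senses of `word` have to that context word.
    """
    scores = {}
    for sense in SENSES[word]:
        total = 0.0
        for context_word, freq in context_counts.items():
            if context_word == word:
                continue
            norm = sum(SIMILARITY.get((s, context_word), 0.0) for s in SENSES[word])
            if norm > 0:
                total += freq * SIMILARITY.get((sense, context_word), 0.0) / norm
        scores[sense] = total
    return max(scores, key=scores.get)


def disambiguate(docs, doc_topics):
    """Label every ambiguous word with the predominant sense of its document's topic."""
    counts = topic_word_counts(docs, doc_topics)
    per_topic = {
        topic: {w: predominant_sense(w, c) for w in c if w in SENSES}
        for topic, c in counts.items()
    }
    return [
        [(w, per_topic[topic][w]) for w in doc if w in SENSES]
        for doc, topic in zip(docs, doc_topics)
    ]


# Toy corpus; the topic assignments are assumed, not inferred.
DOCS = [
    ["bank", "money", "loan"],
    ["bank", "water", "fish"],
    ["money", "loan", "bank", "loan"],
]
DOC_TOPICS = [0, 1, 0]
PREDICTIONS = disambiguate(DOCS, DOC_TOPICS)
```

With this toy data, "bank" receives the finance sense in the two documents assigned to topic 0 and the river sense in the document assigned to topic 1, because the pooled word frequencies differ between the topics.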