Nonparametric Bayesian word sense induction

Authors:
Xuchen Yao;Benjamin Van Durme
Affiliations:
Johns Hopkins University;Johns Hopkins University
Venue:
TextGraphs-6 Proceedings of TextGraphs-6: Graph-based Methods for Natural Language Processing
Year:
2011

References:
Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. Readings in uncertain reasoning.
The British national corpus. The digital word.
Discovering word senses from text. Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining.
Building a large annotated corpus of English: the penn treebank. Computational Linguistics - Special issue on using large corpora: II.
Automatic word sense discrimination. Computational Linguistics - Special issue on word sense disambiguation.
Word-sense disambiguation using statistical models of Roget's categories trained on large corpora. COLING '92 Proceedings of the 14th conference on Computational linguistics - Volume 2.
Discovering corpus-specific word senses. EACL '03 Proceedings of the tenth conference on European chapter of the Association for Computational Linguistics - Volume 2.
Bayesian word sense induction. EACL '09 Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics.
OntoNotes: the 90% solution. NAACL-Short '06 Proceedings of the Human Language Technology Conference of the NAACL, Companion Volume: Short Papers.
Semeval-2007 task 02: evaluating word sense induction and discrimination systems. SemEval '07 Proceedings of the 4th International Workshop on Semantic Evaluations.
I2R: three systems for word sense discrimination, Chinese word sense disambiguation, and English word sense disambiguation. SemEval '07 Proceedings of the 4th International Workshop on Semantic Evaluations.
UMND2: SenseClusters applied to the sense induction task of Senseval-4. SemEval '07 Proceedings of the 4th International Workshop on Semantic Evaluations.
Unsupervised and constrained Dirichlet process mixture models for verb clustering. GEMS '09 Proceedings of the Workshop on Geometrical Models of Natural Language Semantics.
Topic models for word sense disambiguation and token-based idiom detection. ACL '10 Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics.
Word sense induction & disambiguation using hierarchical random graphs. EMNLP '10 Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing.
A mixture model with sharing for lexical semantics. EMNLP '10 Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing.

Cited by:
Word sense induction for novel sense detection. EACL '12 Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics.

Abstract

We propose the use of a nonparametric Bayesian model, the Hierarchical Dirichlet Process (HDP), for the task of Word Sense Induction. We evaluate it against Latent Dirichlet Allocation (LDA), a parametric Bayesian model employed by Brody and Lapata (2009) for this task. We find that the two models achieve similar levels of induction quality, while the HDP confers the advantage of automatically inducing a variable number of senses per word, rather than requiring the number of senses to be fixed manually a priori, as in LDA. This flexibility allows the model to adapt to terms with greater or lesser polysemy, when evidenced by corpus distributional statistics. Experimental results confirm that the model, when trained on out-of-domain data and then applied in a restricted domain, is able to make use of a restricted set of topically coherent induced senses.
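
The contrast between a fixed and an induced number of senses can be illustrated with a small sketch. The code below is not the authors' model or implementation: it samples from a Chinese restaurant process, the clustering prior underlying a single-level Dirichlet process, whereas the HDP in the paper adds a further hierarchical level and conditions on context words. The function name `crp_assignments` and the parameter `alpha` (the concentration parameter) are assumptions introduced here for illustration only.

```python
import random


def crp_assignments(n_items, alpha, seed=0):
    """Sample cluster assignments for n_items from a Chinese restaurant process.

    Illustrative assumption: each item stands for one occurrence of a target
    word, and each cluster stands for one induced sense. No text is modelled.
    """
    rng = random.Random(seed)
    counts = []       # counts[k] = number of items currently in cluster k
    assignments = []  # assignments[i] = cluster index chosen for item i
    for _ in range(n_items):
        # An existing cluster k is chosen with weight counts[k];
        # a brand-new cluster is opened with weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)   # open a new cluster
        else:
            counts[k] += 1     # join an existing cluster
        assignments.append(k)
    return assignments, counts


if __name__ == "__main__":
    for alpha in (0.1, 1.0, 5.0):
        _, counts = crp_assignments(200, alpha)
        print(f"alpha={alpha}: {len(counts)} clusters, sizes={counts}")
```

The number of clusters is never passed in; it emerges from the data size and `alpha`, with larger `alpha` tending to produce more clusters. A parametric model such as LDA would instead take the number of topics as a fixed input.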