Bayesian word sense induction

Authors:
Samuel Brody; Mirella Lapata
Affiliations:
Columbia University; University of Edinburgh
Venue:
EACL '09 Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics
Year:
2009

Abstract

Sense induction seeks to automatically identify word senses directly from a corpus. A key assumption underlying previous work is that the context surrounding an ambiguous word is indicative of its meaning. Sense induction is thus typically viewed as an unsupervised clustering problem where the aim is to partition a word's contexts into different classes, each representing a word sense. Our work places sense induction in a Bayesian context by modeling the contexts of the ambiguous word as samples from a multinomial distribution over senses which are in turn characterized as distributions over words. The Bayesian framework provides a principled way to incorporate a wide range of features beyond lexical co-occurrences and to systematically assess their utility on the sense induction task. The proposed approach yields improvements over state-of-the-art systems on a benchmark dataset.
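
The generative story described in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the authors' implementation: the abstract does not specify priors or hyperparameters, so the Dirichlet priors, the values of `alpha` and `beta`, the toy vocabulary, and the function name `generate_contexts` are all hypothetical choices made for illustration. Each context of the ambiguous word draws its own multinomial distribution over senses, and each sense is a distribution over words.

```python
import numpy as np


def generate_contexts(n_contexts, context_len, n_senses, vocab,
                      alpha=0.5, beta=0.5, seed=0):
    """Sample toy contexts of an ambiguous word from a sense mixture model.

    alpha and beta are assumed Dirichlet hyperparameters (not from the source).
    """
    rng = np.random.default_rng(seed)
    # One distribution over the vocabulary per sense; each row sums to 1.
    sense_word = rng.dirichlet(np.full(len(vocab), beta), size=n_senses)

    contexts, sense_mixes = [], []
    for _ in range(n_contexts):
        # Multinomial distribution over senses for this context.
        theta = rng.dirichlet(np.full(n_senses, alpha))
        # Draw a sense for every context word, then a word from that sense.
        senses = rng.choice(n_senses, size=context_len, p=theta)
        words = [vocab[rng.choice(len(vocab), p=sense_word[s])] for s in senses]
        contexts.append(words)
        sense_mixes.append(theta)
    return contexts, np.array(sense_mixes), sense_word


if __name__ == "__main__":
    toy_vocab = ["river", "water", "shore", "money", "loan", "credit"]
    ctxs, mixes, _ = generate_contexts(3, 5, 2, toy_vocab)
    for words, theta in zip(ctxs, mixes):
        print(words, np.round(theta, 2))
```

Sense induction runs this story in reverse: given only the observed contexts, it estimates the per-context sense distributions and the per-sense word distributions. Additional feature types beyond lexical co-occurrence would enter as further observed variables generated from the same senses.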