Clustering Algorithms
Unsupervised word sense discrimination relies on the idea that words that occur in similar contexts will have similar meanings. These techniques cluster the contexts in which an ambiguous word occurs, and the number of clusters discovered indicates the number of senses in which the ambiguous word is used. One important distinction among these methods is the underlying means of representing the contexts to be clustered. This paper compares the efficacy of first-order methods, which directly represent the features that occur in a context, with several second-order methods that use a more indirect representation. The experiments in this paper show that second-order methods that use word-by-word co-occurrence matrices result in the highest accuracy and the most robust word sense discrimination. These experiments were conducted on MedLine abstracts that contained pseudo-words, which were created by conflating pairs of MeSH preferred terms into new ambiguous words. The experiments were carried out with SenseClusters, a freely available open source software package.
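The distinction between first-order and second-order context representations can be illustrated with a minimal Python sketch. It is not the SenseClusters implementation: the function names, the single-token terms, the whole-context co-occurrence window, and the averaging of co-occurrence rows are all assumptions made for illustration.

```python
from collections import Counter
import re


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def make_pseudo_word(contexts, term_a, term_b, pseudo):
    """Conflate two terms into one artificial ambiguous token.

    Assumption: both terms are single tokens; real MeSH preferred
    terms may span several words.
    """
    return [
        [pseudo if t in (term_a, term_b) else t for t in tokenize(text)]
        for text in contexts
    ]


def first_order(tokens, target):
    """First-order: count the words that occur in the context itself."""
    return Counter(t for t in tokens if t != target)


def cooccurrence(token_lists):
    """Word-by-word co-occurrence counts over all contexts.

    Assumption: two words co-occur when they share a context.
    """
    matrix = {}
    for tokens in token_lists:
        vocab = set(tokens)
        for w in vocab:
            row = matrix.setdefault(w, Counter())
            for v in vocab:
                if v != w:
                    row[v] += 1
    return matrix


def second_order(tokens, target, matrix):
    """Second-order: average the co-occurrence rows of the context words."""
    words = [t for t in tokens if t != target]
    total = Counter()
    for w in words:
        total.update(matrix.get(w, {}))
    n = max(len(words), 1)
    return {k: v / n for k, v in total.items()}


# Hypothetical toy contexts, not drawn from MedLine.
contexts = ["Aspirin reduces fever", "Insulin lowers glucose"]
token_lists = make_pseudo_word(contexts, "aspirin", "insulin", "aspirin_insulin")
matrix = cooccurrence(token_lists)
direct = first_order(token_lists[0], "aspirin_insulin")
indirect = second_order(token_lists[0], "aspirin_insulin", matrix)
```

In this sketch the first-order vector records only the words present in a context, while the second-order vector describes the context through the words that its words co-occur with elsewhere. Either kind of vector could then be handed to a clustering algorithm, with each resulting cluster read as one sense of the pseudo-word.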