Word sense disambiguation in text remains a difficult problem because the best supervised methods require laborious and costly preparation of training data. This work evaluates selected clustering algorithms on the task of word sense disambiguation. We used five datasets covering two languages, English and Polish. Five clustering algorithms (k-means, k-medoids, hierarchical agglomerative clustering, hierarchical divisive clustering, and graph-partitioning-based clustering) and two weighting schemes were tested. The best parameters of the algorithms were chosen using 5 × 2 cross-validation. The BCubed measure was used to evaluate the clusterings. We conclude that, with these settings, agglomerative hierarchical clustering achieves the best results on all the datasets.
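To make the evaluation measure concrete, the following is a minimal sketch of BCubed precision, recall, and F-score for a hard (non-overlapping) clustering. It assumes the standard extrinsic definition, in which each item's precision is the fraction of items in its cluster that share its gold sense and its recall is the fraction of items with its gold sense that share its cluster. The abstract does not say which BCubed variant or F-score weighting was used, so the function name `bcubed`, its arguments, and the unweighted harmonic mean are illustrative assumptions.

```python
from collections import Counter


def bcubed(clusters, senses):
    """Return (precision, recall, f_score) for a hard clustering.

    clusters: list of cluster ids, one per occurrence of the target word.
    senses:   list of gold sense labels, aligned with `clusters`.
    Both names are hypothetical; they are not taken from the paper.
    """
    if len(clusters) != len(senses):
        raise ValueError("clusters and senses must have the same length")
    n = len(clusters)
    if n == 0:
        raise ValueError("at least one item is required")

    cluster_sizes = Counter(clusters)           # items per cluster
    sense_sizes = Counter(senses)               # items per gold sense
    pair_sizes = Counter(zip(clusters, senses))  # items per (cluster, sense)

    # Per-item scores, averaged over all items.
    precision = sum(
        pair_sizes[(c, s)] / cluster_sizes[c] for c, s in zip(clusters, senses)
    ) / n
    recall = sum(
        pair_sizes[(c, s)] / sense_sizes[s] for c, s in zip(clusters, senses)
    ) / n

    # Assumption: unweighted harmonic mean of precision and recall.
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


if __name__ == "__main__":
    # Toy data: five occurrences, two induced clusters, two gold senses.
    p, r, f = bcubed([0, 0, 0, 1, 1], ["a", "a", "b", "b", "b"])
    print(f"precision={p:.3f} recall={r:.3f} f={f:.3f}")
```

Because the scores are averaged per item rather than per cluster, large clusters and frequent senses contribute in proportion to their size, which is one reason the measure is often chosen for comparing clusterings against sense-annotated data.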