The main disadvantage of collocation-based word sense disambiguation is its low recall, despite relatively high precision. How can recall be improved without sacrificing precision? In this paper, we investigate a word-class approach that extends the collocation list constructed from a manually sense-tagged corpus, where the word classes are obtained from a larger corpus that is not sense-tagged. Experimental results show that the F-measure improves to 71%, compared with 54% for the baseline system that does not use word classes, although precision decreases slightly. Further study reveals the relationship between the F-measure and the number of word classes trained on corpora of various sizes.
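The core idea above can be sketched as follows: a collocation list maps (target word, collocate) pairs to senses, and word classes let collocates generalize to other members of their class. This is a minimal illustrative sketch; all data, function names, and the first-match disambiguation rule are assumptions, not details from the paper.

```python
# Sketch: extending a collocation-based WSD list with word classes.
# All names and data are illustrative assumptions, not from the paper.

from collections import defaultdict

# Collocation list from a (hypothetical) sense-tagged corpus:
# (target word, collocate) -> sense label
collocations = {
    ("bank", "river"): "bank/shore",
    ("bank", "loan"): "bank/finance",
}

# Word classes from a larger, untagged corpus: class id -> member words
word_classes = {
    "C1": {"river", "stream", "creek"},
    "C2": {"loan", "mortgage", "interest"},
}

def build_class_index(classes):
    """Map each word to the ids of the classes containing it."""
    index = defaultdict(set)
    for cid, members in classes.items():
        for w in members:
            index[w].add(cid)
    return index

def extend_collocations(collocations, classes):
    """Add (target, class member) pairs that share a class with a known collocate."""
    index = build_class_index(classes)
    extended = dict(collocations)
    for (target, collocate), sense in collocations.items():
        for cid in index.get(collocate, ()):
            for member in classes[cid]:
                extended.setdefault((target, member), sense)
    return extended

def disambiguate(target, context_words, collocation_list):
    """Return the first sense triggered by a context word, else None."""
    for w in context_words:
        sense = collocation_list.get((target, w))
        if sense is not None:
            return sense
    return None

extended = extend_collocations(collocations, word_classes)
# "stream" never co-occurred with "bank" in the tagged corpus,
# but it shares a class with "river", so the extended list covers it.
print(disambiguate("bank", ["stream"], collocations))  # None
print(disambiguate("bank", ["stream"], extended))      # bank/shore
```

The recall gain comes from `extend_collocations`: contexts containing an unseen class member now trigger a sense, while the slight precision drop corresponds to class members that are poor sense indicators inheriting a sense anyway.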