Improving word sense disambiguation by pseudo-samples

Authors:
Xiaojie Wang;Yuji Matsumoto
Affiliations:
Graduate School of Information Science, Nara Institute of Science and Technology, Nara, Japan;Graduate School of Information Science, Nara Institute of Science and Technology, Nara, Japan
Venue:
IJCNLP'04 Proceedings of the First international joint conference on Natural Language Processing
Year:
2004

Citing 9
Cited 0

An automatic method for generating sense tagged corpora

AAAI '99/IAAI '99 Proceedings of the sixteenth national conference on Artificial intelligence and the eleventh Innovative applications of artificial intelligence conference innovative applications of artificial intelligence
Introduction to the special issue on word sense disambiguation: the state of the art

Computational Linguistics - Special issue on word sense disambiguation
Similarity-based word sense disambiguation

Computational Linguistics - Special issue on word sense disambiguation
Using corpus statistics and WordNet relations for sense identification

Computational Linguistics - Special issue on word sense disambiguation
Statistical sense disambiguation with relatively small corpora using dictionary definitions

ACL '95 Proceedings of the 33rd annual meeting on Association for Computational Linguistics
Unsupervised word sense disambiguation rivaling supervised methods

ACL '95 Proceedings of the 33rd annual meeting on Association for Computational Linguistics
An unsupervised method for word sense tagging using parallel corpora

ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
Word translation disambiguation using Bilingual Bootstrapping

ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
Exploring automatic word sense disambiguation with decision lists and the web

Proceedings of the COLING-2000 Workshop on Semantic Annotation and Intelligent Content

Quantified Score

Hi-index	0.00

Visualization

Abstract

Data sparseness is a major problem in word sense disambiguation. Automatic sample acquisition and smoothing are two ways that have been explored to alleviate the influence of data sparseness. In this paper, we consider a combination of these two methods. Firstly, we propose a pattern-based way to acquire pseudo samples, and then we estimate conditional probabilities for variables by combining pseudo data set with sense tagged data set. By using the combinational estimation, we build an appropriate leverage between the two different data sets, which is vital to achieve the best performance. Experiments show that our approach brings significant improvement for Chinese word sense disambiguation.