Improving Unsupervised WSD with a Dynamic Thesaurus

Authors:
Javier Tejada-Cárcamo;Hiram Calvo;Alexander Gelbukh
Affiliations:
Center for Computing Research, National Polytechnic Institute, Mexico City, México 07738 and Sociedad Peruana de Computación, Arequipa, Perú;Center for Computing Research, National Polytechnic Institute, Mexico City, México 07738;Center for Computing Research, National Polytechnic Institute, Mexico City, México 07738
Venue:
TSD '08 Proceedings of the 11th international conference on Text, Speech and Dialogue
Year:
2008

References (5)

Automatic retrieval and clustering of similar words
COLING '98 Proceedings of the 17th international conference on Computational linguistics - Volume 2

Finding predominant word senses in untagged text
ACL '04 Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics

WordNet::Similarity: measuring the relatedness of concepts
AAAI'04 Proceedings of the 19th national conference on Artificial intelligence

Using information content to evaluate semantic similarity in a taxonomy
IJCAI'95 Proceedings of the 14th international joint conference on Artificial intelligence - Volume 1

Using measures of semantic relatedness for word sense disambiguation
CICLing'03 Proceedings of the 4th international conference on Computational linguistics and intelligent text processing

Cited by (2)

Learning Co-relations of Plausible Verb Arguments with a WSM and a Distributional Thesaurus
CIARP '09 Proceedings of the 14th Iberoamerican Conference on Pattern Recognition: Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications

Unsupervised WSD by finding the predominant sense using context as a dynamic thesaurus
Journal of Computer Science and Technology

Abstract

The method proposed by McCarthy et al. [1] obtains the predominant sense of an ambiguous word from a weighted list of terms related to that word. This list is taken from a thesaurus built with the distributional similarity method proposed by Lin [2]. In that method, every occurrence of the ambiguous word uses the same thesaurus, regardless of the context in which it occurs, and the same static thesaurus serves every word to be disambiguated. In this paper we explore a different method that takes the context of a word into account when determining its most frequent sense (MFS): the list of distributionally similar words is built from the syntactic context of the ambiguous word. We attain a precision of 69.86%, which is 7% higher than the supervised baseline of applying the MFS obtained from 90% of SemCor to the remaining 10% of SemCor.
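
The contrast between a static and a context-dependent list of related terms can be illustrated with a minimal sketch. It is not the authors' implementation: the scoring is a simplified form of the prevalence ranking used in [1] (each related term distributes its weight over the senses in proportion to a sense-to-term similarity), and every name and number below (predominant_sense, the sense labels, the weights, the similarity table) is a hypothetical toy value chosen only for illustration.

```python
from collections import defaultdict


def predominant_sense(senses, related_terms, sense_sim):
    """Rank the senses of one ambiguous word (simplified sketch).

    senses        -- list of sense labels for the ambiguous word
    related_terms -- dict: related term -> weight (distributional similarity)
    sense_sim     -- dict: (sense, term) -> similarity between sense and term
    Returns (best_sense, scores).
    """
    scores = defaultdict(float)
    for term, weight in related_terms.items():
        # Normalise over all senses so each term contributes exactly `weight`.
        total = sum(sense_sim.get((s, term), 0.0) for s in senses)
        if total == 0.0:
            continue  # the term says nothing about any sense
        for s in senses:
            scores[s] += weight * sense_sim.get((s, term), 0.0) / total
    best = max(senses, key=lambda s: scores[s])
    return best, dict(scores)


# Toy data (assumed, not taken from the paper).
SENSES = ["star#celestial", "star#celebrity"]
SENSE_SIM = {
    ("star#celestial", "planet"): 0.8, ("star#celebrity", "planet"): 0.2,
    ("star#celestial", "galaxy"): 0.9, ("star#celebrity", "galaxy"): 0.1,
    ("star#celestial", "actor"): 0.1, ("star#celebrity", "actor"): 0.9,
    ("star#celestial", "director"): 0.2, ("star#celebrity", "director"): 0.8,
}

# Static thesaurus: one list for every occurrence of "star".
STATIC_TERMS = {"planet": 0.4, "actor": 0.5, "galaxy": 0.3}

# Dynamic thesaurus: a list built for one occurrence, here assumed to come
# from a sentence about films.
FILM_CONTEXT_TERMS = {"actor": 0.7, "director": 0.4}

static_best, _ = predominant_sense(SENSES, STATIC_TERMS, SENSE_SIM)
dynamic_best, _ = predominant_sense(SENSES, FILM_CONTEXT_TERMS, SENSE_SIM)
print("static list  :", static_best)
print("dynamic list :", dynamic_best)
```

With the static list the same sense is chosen for every occurrence of the word, whereas the list built for the film-related occurrence selects a different sense. How the paper actually derives the related terms from the syntactic context is not specified in the abstract and is not modelled here.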