Verb sense disambiguation based on dual distributional similarity

Authors:
Jeong-Mi Cho;Jungyun Seo;Gil Chang Kim
Affiliations:
Human & Computer Interaction Lab., Information Processing Sector, Samsung Advanced Institute of Technology, P.O. Box 111, Suwon 440-600, Korea (e-mail: jmcho@sait.samsung.co.kr); Department of Computer Science, Sogang University, Sinsu-dong, Mapo-gu, Seoul 121-742, Korea (e-mail: jyseo@ccs.sogang.ac.kr); Department of Computer Science, Korea Advanced Institute of Science and Technology, 373-1, Kusong-dong, Yusong-gu, Taejon 305-701, Korea (e-mail: gckim@csking.kaist.ac.kr)
Venue:
Natural Language Engineering
Year:
1999

Abstract

This paper presents a system for automatic verb sense disambiguation in Korean using a small corpus and a machine-readable dictionary (MRD). For each sense of a polysemous verb defined in the MRD, the system learns a set of typical uses from the MRD usage examples, using verb-object co-occurrences acquired from the corpus. The paper addresses the problem of data sparseness in two ways. First, by extending word similarity measures from direct co-occurrences to co-occurrences of co-occurring words, we compute word similarities using co-occurring clusters rather than co-occurring words. Second, we acquire IS-A relations of nouns from the MRD definitions; identifying these relations makes it possible to cluster the nouns roughly. With these methods, two words may be considered similar even if they do not share any word elements. Experiments show that the method can learn from a very small training corpus, achieving over 86% correct disambiguation without any restriction on a word's senses.
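
The step from direct co-occurrences to co-occurrences of co-occurring words can be illustrated with a minimal Python sketch. It is not the authors' similarity measure, which the abstract does not specify: the cosine measure, the function names (`cosine`, `first_order`, `second_order`) and the toy English verb-object pairs in `PAIRS` are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from math import sqrt

# Toy verb-object pairs (hypothetical data, not from the paper's corpus).
PAIRS = [
    ("eat", "apple"),
    ("eat", "bread"),
    ("devour", "bread"),
    ("devour", "cake"),
    ("read", "book"),
]


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(w * v[k] for k, w in u.items() if k in v)
    norm = sqrt(sum(w * w for w in u.values())) * sqrt(sum(w * w for w in v.values()))
    return dot / norm if norm else 0.0


def first_order(pairs):
    """Represent each noun by the verbs it directly co-occurs with."""
    vecs = defaultdict(Counter)
    for verb, noun in pairs:
        vecs[noun][verb] += 1
    return vecs


def second_order(pairs):
    """Represent each noun by the objects of the verbs it co-occurs with,
    i.e. by co-occurrences of its co-occurring words."""
    verb_objects = defaultdict(Counter)
    for verb, noun in pairs:
        verb_objects[verb][noun] += 1
    vecs = defaultdict(Counter)
    for verb, noun in pairs:
        for other, count in verb_objects[verb].items():
            vecs[noun][other] += count
    return vecs


first = first_order(PAIRS)
second = second_order(PAIRS)

# "apple" and "cake" never share a verb, so their direct similarity is zero,
# but their verbs ("eat", "devour") share the object "bread".
print(cosine(first["apple"], first["cake"]))
print(cosine(second["apple"], second["cake"]))
```

In this toy data the first-order similarity of "apple" and "cake" is zero, while the second-order similarity is about 0.5, which shows how two words can be judged similar without any shared direct co-occurrence. The IS-A clustering described in the abstract is a separate step and is not covered by this sketch.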