Verb sense disambiguation based on dual distributional similarity

  • Authors:
  • Jeong-Mi Cho; Jungyun Seo; Gil Chang Kim

  • Affiliations:
  • Human & Computer Interaction Lab., Information Processing Sector, Samsung Advanced Institute of Technology, P.O. Box 111, Suwon 440-600, Korea (e-mail: jmcho@sait.samsung.co.kr); Department of Computer Science, Sogang University, Sinsu-dong, Mapo-gu, Seoul 121-742, Korea (e-mail: jyseo@ccs.sogang.ac.kr); Department of Computer Science, Korea Advanced Institute of Science and Technology, 373-1, Kusong-dong, Yusong-gu, Taejon 305-701, Korea (e-mail: gckim@csking.kaist.ac.kr)

  • Venue:
  • Natural Language Engineering
  • Year:
  • 1999

Abstract

This paper presents a system for automatic verb sense disambiguation in Korean that uses a small corpus and a machine-readable dictionary (MRD). For each sense of a polysemous verb given in the MRD definitions, the system learns the set of typical uses listed in the MRD usage examples, using verb-object co-occurrences acquired from the corpus. The paper addresses the problem of data sparseness in two ways. First, by extending word similarity measures from direct co-occurrences to co-occurrences of co-occurring words, we compute similarities between words that do not co-occur directly but belong to co-occurring clusters. Second, we acquire IS-A relations of nouns from the MRD definitions; identifying these relations makes it possible to cluster the nouns roughly. With these methods, two words may be considered similar even if they do not share any word elements. Experiments show that the method can learn from a very small training corpus, achieving over 86% correct disambiguation without any restriction on a word's senses.
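
The idea of extending similarity from direct co-occurrences to co-occurrences of co-occurring words can be illustrated with a minimal Python sketch. It is not the paper's actual measure: the cosine similarity, the expansion scheme, the function names, and the toy English verb-object pairs are all assumptions made for illustration. Each noun is first represented by the verbs it occurs with (first order), then expanded into the objects of those verbs (second order), so that two nouns sharing no verb can still receive a non-zero similarity.

```python
from collections import Counter, defaultdict
from math import sqrt

# Hypothetical verb-object pairs; English stand-ins, not data from the paper.
PAIRS = [
    ("drink", "coffee"), ("drink", "water"), ("drink", "juice"),
    ("pour", "water"), ("pour", "tea"),
    ("read", "book"), ("read", "letter"),
]


def cosine(u, v):
    """Cosine similarity of two sparse count vectors (dict-like)."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def build_vectors(pairs):
    """Return (verb -> object counts, noun -> verb counts)."""
    verb_vecs = defaultdict(Counter)
    noun_vecs = defaultdict(Counter)
    for verb, noun in pairs:
        verb_vecs[verb][noun] += 1
        noun_vecs[noun][verb] += 1
    return verb_vecs, noun_vecs


def first_order_sim(n1, n2, noun_vecs):
    """Nouns are similar only if they occur with the same verbs."""
    return cosine(noun_vecs[n1], noun_vecs[n2])


def second_order_vector(noun, verb_vecs, noun_vecs):
    """Expand a noun into the objects of the verbs it occurs with."""
    expanded = Counter()
    for verb, count in noun_vecs[noun].items():
        for obj, obj_count in verb_vecs[verb].items():
            expanded[obj] += count * obj_count
    return expanded


def second_order_sim(n1, n2, verb_vecs, noun_vecs):
    """Nouns are similar if their verbs take overlapping objects."""
    return cosine(
        second_order_vector(n1, verb_vecs, noun_vecs),
        second_order_vector(n2, verb_vecs, noun_vecs),
    )


verb_vecs, noun_vecs = build_vectors(PAIRS)
direct = first_order_sim("juice", "tea", noun_vecs)
indirect = second_order_sim("juice", "tea", verb_vecs, noun_vecs)
unrelated = second_order_sim("juice", "book", verb_vecs, noun_vecs)
print(direct, indirect, unrelated)
```

In this toy data, "juice" and "tea" never occur with the same verb, so their direct similarity is zero, but "drink" and "pour" both take "water", which gives the pair a positive second-order similarity. "juice" and "book" remain at zero under both measures. A noun clustering derived from IS-A relations could be layered on top by replacing nouns with cluster labels before counting, though that step is not shown here.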