Semi-supervised Clustering for Word Instances and Its Effect on Word Sense Disambiguation

  • Authors:
  • Kazunari Sugiyama; Manabu Okumura

  • Affiliations:
  • Precision and Intelligence Laboratory, Tokyo Institute of Technology, Kanagawa, Japan 226-8503 (both authors)

  • Venue:
  • CICLing '09 Proceedings of the 10th International Conference on Computational Linguistics and Intelligent Text Processing
  • Year:
  • 2009

Abstract

We propose a supervised word sense disambiguation (WSD) system that uses features obtained from the results of clustering word instances. Our approach is novel in two respects. First, we employ semi-supervised clustering that controls the fluctuation of a cluster's centroid. Second, when introducing "must-link" constraints between seed instances, we select the seeds by considering the frequency distribution of word senses and exclude outliers. We then compute features from the word instances in the clusters generated by the semi-supervised clustering and use them in the supervised WSD system. Experimental results show that these features are effective in improving WSD accuracy.
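
The abstract does not give the algorithm's details, so the following is only a rough sketch of the general idea, not the authors' method. It assumes word instances are plain feature vectors, that outliers are seeds lying farther than a fixed multiple of the average distance from their sense's mean, that the number of seeds per sense is proportional to the sense's frequency, and that "controlling the fluctuation of the centroid" can be approximated by a damped centroid update. The names `select_seeds`, `cluster`, `budget`, `outlier_factor`, and `damping` are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def select_seeds(labeled, budget, outlier_factor=1.5):
    """Pick seed instances per sense (assumed scheme, not from the paper).

    labeled: list of (vector, sense) pairs.
    Each sense gets a share of `budget` proportional to its frequency;
    instances farther than outlier_factor * average distance from the
    sense mean are treated as outliers and never become seeds.
    """
    by_sense = defaultdict(list)
    for vec, sense in labeled:
        by_sense[sense].append(vec)

    seeds = {}
    for sense, vecs in by_sense.items():
        center = centroid(vecs)
        ranked = sorted(((distance(v, center), v) for v in vecs),
                        key=lambda pair: pair[0])
        avg = sum(d for d, _ in ranked) / len(ranked)
        kept = [v for d, v in ranked if d <= outlier_factor * avg]
        quota = max(1, round(budget * len(vecs) / len(labeled)))
        seeds[sense] = kept[:quota]
    return seeds


def cluster(seeds, unlabeled, damping=0.5):
    """Seeded clustering with a damped centroid update (assumed scheme).

    Seeds sharing a sense are must-linked: they start in the same cluster.
    Each unlabeled vector joins the nearest cluster, whose centroid then
    moves only a fraction `damping` of the way toward the new member mean.
    """
    members = {s: list(vs) for s, vs in seeds.items()}
    centroids = {s: centroid(vs) for s, vs in seeds.items()}
    for vec in unlabeled:
        nearest = min(centroids, key=lambda s: distance(vec, centroids[s]))
        members[nearest].append(vec)
        target = centroid(members[nearest])
        centroids[nearest] = [old + damping * (new - old)
                              for old, new in zip(centroids[nearest], target)]
    return centroids, members


# Toy data: sense "a" is more frequent and contains one obvious outlier.
labeled = [([0.0, 0.0], "a"), ([0.2, 0.0], "a"), ([0.0, 0.2], "a"),
           ([5.0, 5.0], "a"), ([10.0, 10.0], "b"), ([10.2, 10.0], "b")]
unlabeled = [[0.1, 0.1], [9.9, 10.1]]

seeds = select_seeds(labeled, budget=3)
centroids, members = cluster(seeds, unlabeled)
```

In a pipeline of the kind the abstract describes, quantities derived from `members` (for example, which cluster an instance falls into) would then serve as additional features for the supervised WSD classifier; which features the authors actually compute is not stated here.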