A semi-supervised feature clustering algorithm with application to word sense disambiguation

  • Authors:
  • Zheng-Yu Niu;Dong-Hong Ji;Chew Lim Tan

  • Affiliations:
  • Institute for Infocomm Research, Singapore;Institute for Infocomm Research, Singapore;National University of Singapore, Singapore

  • Venue:
  • HLT '05 Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing
  • Year:
  • 2005

Abstract

In this paper we investigate an application of feature clustering to word sense disambiguation (WSD) and propose a semi-supervised feature clustering algorithm. Unlike other feature clustering methods (e.g., supervised feature clustering), it can infer the distribution of class labels over unseen features, which do not occur in the labeled training data, from the distribution of class labels over seen features, which do. It can therefore handle both seen and unseen features in the feature clustering process. Our experimental results show that feature clustering can aggressively reduce the dimensionality of the feature space while still maintaining state-of-the-art sense disambiguation accuracy. Furthermore, when combined with a semi-supervised WSD algorithm, semi-supervised feature clustering outperforms other dimensionality reduction techniques, which indicates that using unlabeled data in the learning process helps to improve the performance of feature clustering and sense disambiguation.