Using the mutual k-nearest neighbor graphs for semi-supervised classification of natural language data

  • Authors:
  • Kohei Ozaki, Masashi Shimbo, Mamoru Komachi, Yuji Matsumoto

  • Affiliations:
  • Nara Institute of Science and Technology, Takayama, Ikoma, Nara, Japan (all authors)

  • Venue:
  • CoNLL '11 Proceedings of the Fifteenth Conference on Computational Natural Language Learning
  • Year:
  • 2011

Abstract

The first step in graph-based semi-supervised classification is to construct a graph from the input data. While k-nearest neighbor graphs have been the de facto standard method of graph construction, this paper advocates the less well-known mutual k-nearest neighbor graphs for high-dimensional natural language data. To compare the two graph construction methods, we run semi-supervised classification methods on both types of graphs in word sense disambiguation and document classification tasks. The experimental results show that mutual k-nearest neighbor graphs, when combined with maximum spanning trees, consistently outperform k-nearest neighbor graphs. We attribute this better performance to the mutual k-nearest neighbor graph being more resistant to forming hub vertices. Mutual k-nearest neighbor graphs also perform as well as or better than the state-of-the-art b-matching graph construction, despite their lower computational complexity.
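
A minimal sketch (assuming Python with NumPy and SciPy; this is not the authors' implementation) of the two graph constructions compared in the abstract: an ordinary k-NN graph keeps an edge if either endpoint lists the other among its k nearest neighbors, while a mutual k-NN graph keeps an edge only if both do, which suppresses hub vertices. The optional union with a maximum spanning tree, which the paper combines with the mutual graph, is approximated here via SciPy's minimum spanning tree over (1 - similarity); the function names and the toy cosine-similarity data are illustrative assumptions.

import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree

def knn_adjacency(sim, k):
    # Boolean matrix of the (asymmetric) relation "j is among i's k nearest
    # neighbors", excluding self-similarity.
    n = sim.shape[0]
    adj = np.zeros((n, n), dtype=bool)
    for i in range(n):
        order = np.argsort(-sim[i])                  # most similar first
        neighbors = [j for j in order if j != i][:k]
        adj[i, neighbors] = True
    return adj

def knn_graph(sim, k):
    # Ordinary k-NN graph: edge (i, j) if i lists j or j lists i.
    adj = knn_adjacency(sim, k)
    return adj | adj.T

def mutual_knn_graph(sim, k, add_mst=True):
    # Mutual k-NN graph: edge (i, j) only if i lists j and j lists i.
    adj = knn_adjacency(sim, k)
    mutual = adj & adj.T
    if add_mst:
        # Maximum spanning tree over similarities == minimum spanning tree
        # over (1 - similarity); keeps the graph connected. Note that
        # minimum_spanning_tree treats exact zeros as missing edges, which
        # is acceptable for this sketch.
        dist = 1.0 - sim
        np.fill_diagonal(dist, 0.0)
        mst = minimum_spanning_tree(dist).toarray() != 0
        mutual = mutual | mst | mst.T
    return mutual

# Toy usage: cosine similarities of random nonnegative feature vectors.
rng = np.random.default_rng(0)
X = rng.random((8, 5))
norms = np.linalg.norm(X, axis=1)
sim = (X @ X.T) / np.outer(norms, norms)
print("k-NN edges:       ", int(knn_graph(sim, k=3).sum()) // 2)
print("mutual k-NN edges:", int(mutual_knn_graph(sim, k=3).sum()) // 2)

The mutual graph typically has fewer edges than the ordinary k-NN graph built from the same similarities, since high-degree hub points are rarely reciprocated as nearest neighbors; the spanning-tree union is one way to restore connectivity lost by this stricter criterion.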