Using the mutual k-nearest neighbor graphs for semi-supervised classification of natural language data

  • Authors:
  • Kohei Ozaki, Masashi Shimbo, Mamoru Komachi, Yuji Matsumoto

  • Affiliations:
  • Nara Institute of Science and Technology, Takayama, Ikoma, Nara, Japan (all authors)

  • Venue:
  • CoNLL '11 Proceedings of the Fifteenth Conference on Computational Natural Language Learning
  • Year:
  • 2011

Abstract

The first step in graph-based semi-supervised classification is to construct a graph from the input data. While k-nearest neighbor graphs have been the de facto standard method of graph construction, this paper advocates the less well-known mutual k-nearest neighbor graphs for high-dimensional natural language data. To compare the two graph construction methods, we run semi-supervised classification methods on both types of graphs in word sense disambiguation and document classification tasks. The experimental results show that mutual k-nearest neighbor graphs, when combined with maximum spanning trees, consistently outperform k-nearest neighbor graphs. We attribute this better performance to the mutual k-nearest neighbor graph being more resistant to forming hub vertices. Mutual k-nearest neighbor graphs also perform as well as or better than the state-of-the-art b-matching graph construction, despite their lower computational complexity.
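
A minimal sketch (assuming Python with NumPy and SciPy; this is not the authors' implementation) of the two graph constructions compared in the abstract: an ordinary k-NN graph keeps an edge if either endpoint lists the other among its k nearest neighbors, while a mutual k-NN graph keeps an edge only if both do, which suppresses hub vertices. The optional union with a maximum spanning tree, which the paper combines with the mutual graph, is approximated here via SciPy's minimum spanning tree over (1 - similarity); the function names and the toy cosine-similarity data are illustrative assumptions.

import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree

def knn_adjacency(sim, k):
    # Boolean matrix of the (asymmetric) relation "j is among i's k nearest
    # neighbors", excluding self-similarity.
    n = sim.shape[0]
    adj = np.zeros((n, n), dtype=bool)
    for i in range(n):
        order = np.argsort(-sim[i])                  # most similar first
        neighbors = [j for j in order if j != i][:k]
        adj[i, neighbors] = True
    return adj

def knn_graph(sim, k):
    # Ordinary k-NN graph: edge (i, j) if i lists j or j lists i.
    adj = knn_adjacency(sim, k)
    return adj | adj.T

def mutual_knn_graph(sim, k, add_mst=True):
    # Mutual k-NN graph: edge (i, j) only if i lists j and j lists i.
    adj = knn_adjacency(sim, k)
    mutual = adj & adj.T
    if add_mst:
        # Maximum spanning tree over similarities == minimum spanning tree
        # over (1 - similarity); keeps the graph connected. Note that
        # minimum_spanning_tree treats exact zeros as missing edges, which
        # is acceptable for this sketch.
        dist = 1.0 - sim
        np.fill_diagonal(dist, 0.0)
        mst = minimum_spanning_tree(dist).toarray() != 0
        mutual = mutual | mst | mst.T
    return mutual

# Toy usage: cosine similarities of random nonnegative feature vectors.
rng = np.random.default_rng(0)
X = rng.random((8, 5))
norms = np.linalg.norm(X, axis=1)
sim = (X @ X.T) / np.outer(norms, norms)
print("k-NN edges:       ", int(knn_graph(sim, k=3).sum()) // 2)
print("mutual k-NN edges:", int(mutual_knn_graph(sim, k=3).sum()) // 2)

The mutual graph typically has fewer edges than the ordinary k-NN graph built from the same similarities, since high-degree hub points are rarely reciprocated as nearest neighbors; the spanning-tree union is one way to restore connectivity lost by this stricter criterion.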