Using clustering analysis to improve semi-supervised classification

  • Authors:
  • Haitao Gan;Nong Sang;Rui Huang;Xiaojun Tong;Zhiping Dan

  • Affiliations:
  • Institute for Pattern Recognition and Artificial Intelligence, Huazhong University of Science and Technology, Wuhan 430074, China;Institute for Pattern Recognition and Artificial Intelligence, Huazhong University of Science and Technology, Wuhan 430074, China;Institute for Pattern Recognition and Artificial Intelligence, Huazhong University of Science and Technology, Wuhan 430074, China;College of Mathematics and Computer Science, Wuhan Textile University, Wuhan 430073, China;Institute for Pattern Recognition and Artificial Intelligence, Huazhong University of Science and Technology, Wuhan 430074, China

  • Venue:
  • Neurocomputing
  • Year:
  • 2013

Abstract

Semi-supervised classification has recently become an active research topic, and a number of algorithms, such as Self-training, have been proposed to improve the performance of supervised classification by exploiting unlabeled data. In this paper, we propose a semi-supervised learning framework that combines clustering and classification. Our motivation is that clustering analysis is a powerful knowledge-discovery tool that may reveal the underlying structure of the data space from unlabeled data. In our framework, semi-supervised clustering is integrated into Self-training classification to help train a better classifier. In particular, the semi-supervised fuzzy c-means algorithm and support vector machines are used for clustering and classification, respectively. Experimental results on artificial and real datasets demonstrate the advantages of the proposed framework.
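The clustering half of such a framework can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes one cluster per class, at least one labeled point per class, and a fuzzy c-means variant in which labeled points simply keep a fixed one-hot membership. The function names, the fuzzifier `m`, and the 0.9 confidence threshold are all hypothetical choices made for illustration. The classifier step (a support vector machine retrained on the enlarged labeled set) is omitted.

```python
import math


def semi_supervised_fcm(points, labels, n_clusters, m=2.0, n_iter=50):
    """Fuzzy c-means in which labeled points keep a fixed one-hot membership.

    points: list of equal-length tuples of floats
    labels: class index for labeled points, None for unlabeled ones
    Assumes one cluster per class and at least one labeled point per class.
    Returns (centers, memberships).
    """
    dim = len(points[0])
    # Labeled points start (and stay) one-hot; unlabeled points start uniform.
    u = [
        [1.0 if lab == k else 0.0 for k in range(n_clusters)]
        if lab is not None
        else [1.0 / n_clusters] * n_clusters
        for lab in labels
    ]
    centers = []
    for _ in range(n_iter):
        # Center update: mean of all points weighted by membership ** m.
        centers = []
        for k in range(n_clusters):
            w = [row[k] ** m for row in u]
            total = sum(w)
            centers.append(
                tuple(
                    sum(wi * p[d] for wi, p in zip(w, points)) / total
                    for d in range(dim)
                )
            )
        # Membership update, applied to unlabeled points only.
        for i, (p, lab) in enumerate(zip(points, labels)):
            if lab is not None:
                continue
            # Clamp distances away from zero to avoid division by zero.
            dist = [max(math.dist(p, c), 1e-12) for c in centers]
            inv = [d ** (-2.0 / (m - 1.0)) for d in dist]
            s = sum(inv)
            u[i] = [v / s for v in inv]
    return centers, u


def confident_unlabeled(labels, u, threshold=0.9):
    """Return (index, cluster) pairs for unlabeled points whose largest
    membership is at least `threshold`."""
    picks = []
    for i, (lab, row) in enumerate(zip(labels, u)):
        if lab is None:
            k = max(range(len(row)), key=row.__getitem__)
            if row[k] >= threshold:
                picks.append((i, k))
    return picks


# Toy data: two well-separated groups plus one ambiguous point in between.
points = [
    (0.0, 0.0), (0.2, 0.1), (0.1, 0.3),
    (5.0, 5.0), (5.2, 4.9), (4.9, 5.3),
    (2.5, 2.6),
]
labels = [0, None, None, 1, None, None, None]

centers, u = semi_supervised_fcm(points, labels, n_clusters=2)
picks = confident_unlabeled(labels, u)
print(picks)
```

In a Self-training loop, the selected points would be added to the labeled set with their cluster index as a pseudo-label before the classifier is retrained. The ambiguous midpoint stays unlabeled because neither membership reaches the threshold.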