Semi-supervised clustering ensemble based on collaborative training

  • Authors:
  • Jinyuan Zhang; Yan Yang; Hongjun Wang; Amjad Mahmood; Feifei Huang

  • Affiliations:
  • School of Information Science & Technology, Southwest Jiaotong University, Chengdu, P.R. China; Key Lab. of Cloud Computing and Intelligent Technology, Chengdu, Sichuan Province, P.R. China (same affiliations listed for all five authors)

  • Venue:
  • RSKT'12 Proceedings of the 7th international conference on Rough Sets and Knowledge Technology
  • Year:
  • 2012

Abstract

Recent research on data clustering increasingly focuses on combining multiple data partitions as a way to improve the robustness of clustering solutions. Most of this work has focused on crisp clustering combination. Semi-supervised clustering uses a small amount of labeled data to aid and bias the clustering of unlabeled data. In contrast, this paper proposes a semi-supervised clustering ensemble model based on collaborative training (SCET) and an unsupervised clustering ensemble model based on collaborative training (UCET). In SCET, semi-supervised learning is introduced in the ensemble step. In UCET, the prior knowledge used in SCET is replaced by information extracted from the base clusterings. Tri-training is then used as the consensus function of the clustering ensemble. Experiments on datasets from the UCI Machine Learning Repository indicate that the models improve clustering accuracy.
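The general idea of using tri-training as a consensus function can be sketched roughly as follows. This is not the authors' implementation: the abstract does not say which base classifiers are used, how cluster labels are aligned across base clusterings, or exactly what information UCET extracts. The sketch assumes that the base partitions are already label-aligned, that points on which all base partitions agree serve as pseudo-labeled seeds (a UCET-like setting; in a SCET-like setting the seeds would come from the given labeled data), that each of the three learners is a nearest-centroid classifier, and that the error-rate checks of full tri-training are omitted. All function names are hypothetical.

```python
import random
from collections import Counter

def agreed_labels(base_partitions):
    """Seeds: indices on which every (label-aligned) base partition agrees."""
    first = base_partitions[0]
    return {i: first[i] for i in range(len(first))
            if all(p[i] == first[i] for p in base_partitions)}

def fit(X, pairs):
    """Nearest-centroid model: {label: centroid} from (index, label) pairs."""
    sums, counts = {}, Counter()
    for i, label in pairs:
        counts[label] += 1
        acc = sums.setdefault(label, [0.0] * len(X[i]))
        for d, v in enumerate(X[i]):
            acc[d] += v
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}

def predict(model, x):
    """Label of the centroid closest to x (squared Euclidean distance)."""
    return min(model, key=lambda lab: sum((a - b) ** 2 for a, b in zip(x, model[lab])))

def bootstrap(pairs, rng):
    """Bootstrap sample; fall back to the full set if a class went missing."""
    sample = [rng.choice(pairs) for _ in pairs]
    if {lab for _, lab in sample} != {lab for _, lab in pairs}:
        return list(pairs)
    return sample

def tri_training_consensus(X, seed_labels, rounds=5, seed=0):
    """Simplified tri-training: each learner is retrained on its own sample
    plus the unlabeled points on which the other two learners agree."""
    rng = random.Random(seed)
    seeds = sorted(seed_labels.items())
    unlabeled = [i for i in range(len(X)) if i not in seed_labels]
    train_sets = [bootstrap(seeds, rng) for _ in range(3)]
    models = [fit(X, ts) for ts in train_sets]
    for _ in range(rounds):
        preds = [[predict(m, X[i]) for i in unlabeled] for m in models]
        new_models = []
        for k in range(3):
            a, b = [j for j in range(3) if j != k]
            extra = [(i, preds[a][t]) for t, i in enumerate(unlabeled)
                     if preds[a][t] == preds[b][t]]
            new_models.append(fit(X, train_sets[k] + extra))
        if new_models == models:  # no learner changed, so stop early
            break
        models = new_models
    # Final consensus: keep seed labels, majority-vote everything else.
    consensus = []
    for i in range(len(X)):
        if i in seed_labels:
            consensus.append(seed_labels[i])
        else:
            votes = Counter(predict(m, X[i]) for m in models)
            consensus.append(votes.most_common(1)[0][0])
    return consensus

# Toy data: two well-separated 1-D groups and three base partitions,
# two of which each mislabel one point.
X = [[0.0], [0.2], [0.4], [0.1], [5.0], [5.2], [5.4], [5.1]]
base_partitions = [
    [0, 0, 0, 0, 1, 1, 1, 1],
    [0, 0, 1, 0, 1, 1, 1, 1],
    [0, 0, 0, 0, 1, 1, 1, 0],
]
seeds = agreed_labels(base_partitions)
consensus = tri_training_consensus(X, seeds)
print(consensus)
```

In this toy run, the two points on which the base partitions disagree are left unlabeled and are assigned by the three learners' vote. A real system would also need a label-alignment step across base clusterings, which this sketch simply assumes has been done.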