Semi-supervised clustering ensemble based on collaborative training

Authors:
Jinyuan Zhang;Yan Yang;Hongjun Wang;Amjad Mahmood;Feifei Huang
Affiliations:
School of Information Science & Technology, Southwest Jiaotong University, Chengdu, P.R. China; Key Lab. of Cloud Computing and Intelligent Technology, Chengdu, Sichuan Province, P.R. China (all five authors)
Venue:
RSKT'12 Proceedings of the 7th international conference on Rough Sets and Knowledge Technology
Year:
2012

Citing 6
Cited 0

Combining labeled and unlabeled data with co-training. COLT'98 Proceedings of the eleventh annual conference on Computational learning theory
Cluster ensembles --- a knowledge reuse framework for combining multiple partitions. The Journal of Machine Learning Research
Tri-Training: Exploiting Unlabeled Data Using Three Classifiers. IEEE Transactions on Knowledge and Data Engineering
On voting-based consensus of cluster ensembles. Pattern Recognition
When Does Cotraining Work in Real Data? IEEE Transactions on Knowledge and Data Engineering
Topic discovery from document using ant-based clustering combination. APWeb'05 Proceedings of the 7th Asia-Pacific web conference on Web Technologies Research and Development

Abstract

Recent research on data clustering increasingly focuses on combining multiple data partitions to improve the robustness of clustering solutions, and most of this work addresses the combination of crisp clusterings. Semi-supervised clustering uses a small amount of labeled data to aid and bias the clustering of unlabeled data. In this paper, we propose a semi-supervised clustering ensemble model based on collaborative training (SCET) and an unsupervised clustering ensemble model based on collaborative training (UCET). SCET introduces semi-supervised learning into the ensemble step. In UCET, the knowledge used in SCET is replaced by information extracted from the base clusterings. Tri-training is then used as the consensus function of the clustering ensemble. Experiments on datasets from the UCI Machine Learning Repository indicate that the proposed approach improves clustering accuracy.
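
The abstract does not spell out how knowledge is extracted from the base clusterings or how tri-training produces the consensus, so the following is only a rough sketch of the general idea, not the authors' algorithm. It assumes that the base clusterings already use aligned cluster labels, that points on which every base clustering agrees can serve as pseudo-labeled data, and that three nearest-centroid classifiers trained on different subsets vote on the remaining points. Standard tri-training uses bootstrap samples and iterative refinement; both are replaced here by a deterministic single pass to keep the example short. All function names are hypothetical.

```python
from collections import Counter


def fit_centroids(points, labels):
    """Return {label: centroid} computed from the labeled points."""
    sums, counts = {}, Counter(labels)
    for p, y in zip(points, labels):
        acc = sums.setdefault(y, [0.0] * len(p))
        for i, v in enumerate(p):
            acc[i] += v
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def predict(centroids, p):
    """Label of the centroid closest to p (squared Euclidean distance)."""
    return min(
        centroids,
        key=lambda y: sum((a - b) ** 2 for a, b in zip(centroids[y], p)),
    )


def consensus_by_three_voters(points, base_labelings):
    """Simplified, single-pass consensus inspired by tri-training.

    Assumption: labels in base_labelings are already aligned across
    the base clusterings (label 0 means the same cluster everywhere).
    """
    n = len(points)
    # Points on which all base clusterings agree act as pseudo-labels.
    agreed = [i for i in range(n)
              if len({lab[i] for lab in base_labelings}) == 1]
    if not agreed:
        raise ValueError("base clusterings agree on no point")
    pseudo = {i: base_labelings[0][i] for i in agreed}

    # Train three classifiers, each leaving out a different third of
    # the agreed points (a deterministic stand-in for bootstrapping).
    models = []
    for k in range(3):
        subset = [i for j, i in enumerate(agreed) if j % 3 != k] or agreed
        models.append(fit_centroids([points[i] for i in subset],
                                    [pseudo[i] for i in subset]))

    # Disputed points get the majority vote of the three classifiers.
    final = []
    for i in range(n):
        if i in pseudo:
            final.append(pseudo[i])
        else:
            votes = Counter(predict(m, points[i]) for m in models)
            final.append(votes.most_common(1)[0][0])
    return final


# Toy data: two tight groups plus one point the base clusterings dispute.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
          (5.0, 5.0), (5.1, 4.9), (4.9, 5.2),
          (1.0, 1.0)]
base_labelings = [
    [0, 0, 0, 1, 1, 1, 0],
    [0, 0, 0, 1, 1, 1, 1],
    [0, 0, 0, 1, 1, 1, 0],
]
labels = consensus_by_three_voters(points, base_labelings)
print(labels)  # [0, 0, 0, 1, 1, 1, 0]
```

In this toy run the disputed point (1.0, 1.0) lies much closer to the first group, so all three classifiers assign it label 0. A fuller treatment would also need a label-alignment step across base clusterings and the iterative relabeling that tri-training normally performs.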