Semi-supervised clustering has attracted increasing attention in recent years. For most semi-supervised clustering algorithms, a good initialization method produces high-quality seeds that help improve clustering accuracy. In practice, labeled samples are scarce while unlabeled samples are plentiful, yet most existing initialization methods set the unlabeled data aside, even though it may contain information useful for the clustering task. In this paper, we propose a novel initialization method that converts some of the unlabeled samples into labeled ones: the neighbors of the labeled samples are identified first, and the known labels are then propagated to those unlabeled neighbors. Experimental results show that the proposed initialization method can improve the performance of semi-supervised clustering.
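
The two-step idea described above (find the neighbors of labeled samples, then propagate the known labels to them) can be illustrated with a minimal sketch. The abstract does not say how neighbors are defined or how conflicts are resolved, so the following are assumptions made only for illustration: neighbors are the k nearest unlabeled points under Euclidean distance, a point claimed by two labeled samples takes the label of the closer one, and the enlarged labeled set is averaged per class to obtain seeds. The names `propagate_labels` and `seed_centroids` are hypothetical and do not come from the paper.

```python
import math
from collections import defaultdict


def propagate_labels(points, labels, k=2):
    """Enlarge a small labeled set by labeling neighbors of labeled samples.

    Assumption (not from the paper): the neighbors of a labeled sample are
    its k nearest unlabeled points under Euclidean distance.
    """
    expanded = dict(labels)
    best = {}  # unlabeled index -> (distance, label) of the closest claimant
    unlabeled = [i for i in range(len(points)) if i not in labels]
    for i, lab in labels.items():
        # Rank the unlabeled points by distance to labeled sample i.
        ranked = sorted(unlabeled, key=lambda j: math.dist(points[i], points[j]))
        for j in ranked[:k]:
            d = math.dist(points[i], points[j])
            # Assumption: if two labeled samples claim j, the closer one wins.
            if j not in best or d < best[j][0]:
                best[j] = (d, lab)
    for j, (_, lab) in best.items():
        expanded[j] = lab
    return expanded


def seed_centroids(points, labels):
    """Average the labeled points of each class to get one seed per class."""
    groups = defaultdict(list)
    for i, lab in labels.items():
        groups[lab].append(points[i])
    return {
        lab: tuple(sum(coord) / len(pts) for coord in zip(*pts))
        for lab, pts in groups.items()
    }


# Toy data: two well-separated groups with one labeled sample each.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (4.8, 5.2)]
labels = {0: "a", 3: "b"}

expanded = propagate_labels(points, labels, k=2)
seeds = seed_centroids(points, expanded)
```

In this toy run every unlabeled point receives the label of its own group, so each seed is computed from three points instead of one. A seed-based clustering algorithm could then start from `seeds`; which algorithm the authors actually use is not stated in the abstract.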