Clustering documents with labeled and unlabeled documents using fuzzy semi-Kmeans

Authors:
Chien-Liang Liu;Tao-Hsing Chang;Hsuan-Hsun Li
Affiliations:
Information and Communications Research Laboratories, Industrial Technology Research Institute, Rm. 709, Bldg. 51, 195, Sec. 4, Chung Hsing Rd., Chutung, Hsinchu 310, Taiwan, ROC;Department of Computer Science and Information Engineering, National Kaohsiung University of Applied Sciences, Chien Kung Campus 415, Chien Kung Road, Kaohsiung 807, Taiwan, ROC;Department of Computer Science, National Chiao Tung University, 1001 University Road, Hsinchu 300, Taiwan, ROC
Venue:
Fuzzy Sets and Systems
Year:
2013

Abstract

This work presents fuzzy semi-Kmeans, a fuzzy semi-supervised clustering algorithm for document clustering. Fuzzy semi-Kmeans extends the K-means clustering model and is inspired by the EM algorithm and the Gaussian mixture model. It also allows different fuzzy membership functions to be used to measure the distance between data points. The experiments in this work use a Gaussian weighting function, but a cosine similarity function can be used as well. Experiments on three data sets compare fuzzy semi-Kmeans with several other methods, and the results indicate that fuzzy semi-Kmeans generally outperforms them.
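
The abstract does not give the update rules, so the following is only a minimal sketch of how a K-means-style loop with soft memberships and partial labels might look, not the authors' formulation. It assumes that labeled documents keep a fixed hard membership, that unlabeled documents receive memberships from a Gaussian weighting of their squared Euclidean distance to each centroid, and that centroids are membership-weighted means (an EM-like alternation). The names `gaussian_memberships`, `fuzzy_semi_kmeans`, and the parameter `sigma` are hypothetical and chosen for illustration.

```python
import math


def gaussian_memberships(x, centroids, sigma):
    """Soft memberships of x from a Gaussian weighting of squared distances."""
    d2 = [sum((a - b) ** 2 for a, b in zip(x, c)) for c in centroids]
    # Shift by the smallest distance before exponentiating, for numerical stability.
    d_min = min(d2)
    w = [math.exp(-(d - d_min) / (2.0 * sigma ** 2)) for d in d2]
    total = sum(w)
    return [v / total for v in w]


def fuzzy_semi_kmeans(points, labels, k, sigma=1.0, n_iter=50):
    """Cluster points; labels[i] is a cluster index, or None if unlabeled.

    Assumption: every cluster has at least one labeled point, which is used
    to seed its centroid.
    """
    dim = len(points[0])
    centroids = []
    for j in range(k):
        seeds = [p for p, y in zip(points, labels) if y == j]
        centroids.append([sum(p[d] for p in seeds) / len(seeds) for d in range(dim)])

    memberships = []
    for _ in range(n_iter):
        # E-like step: labeled points keep a hard membership,
        # unlabeled points get Gaussian-weighted soft memberships.
        memberships = []
        for p, y in zip(points, labels):
            if y is None:
                memberships.append(gaussian_memberships(p, centroids, sigma))
            else:
                memberships.append([1.0 if j == y else 0.0 for j in range(k)])
        # M-like step: each centroid becomes the membership-weighted mean.
        for j in range(k):
            weight = sum(m[j] for m in memberships)
            centroids[j] = [
                sum(m[j] * p[d] for m, p in zip(memberships, points)) / weight
                for d in range(dim)
            ]
    return centroids, memberships


# Toy usage: two labeled seeds, four unlabeled points.
points = [[0.0, 0.0], [0.2, 0.1], [0.1, 0.3], [5.0, 5.0], [5.2, 4.9], [4.8, 5.1]]
labels = [0, None, None, 1, None, None]
centroids, memberships = fuzzy_semi_kmeans(points, labels, k=2)
assigned = [max(range(2), key=m.__getitem__) for m in memberships]
```

In this sketch, swapping the Gaussian weighting for a cosine-similarity-based weighting would only require replacing `gaussian_memberships`, which is one way to read the flexibility the abstract mentions.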