Proceedings of the 1998 conference on Advances in neural information processing systems II
Text Classification from Labeled and Unlabeled Documents using EM
Machine Learning - Special issue on information retrieval
Normalized Cuts and Image Segmentation
IEEE Transactions on Pattern Analysis and Machine Intelligence
Unsupervised learning by probabilistic latent semantic analysis
Machine Learning
The use of unlabeled data to improve supervised learning for text summarization
SIGIR '02 Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval
Constrained K-means Clustering with Background Knowledge
ICML '01 Proceedings of the Eighteenth International Conference on Machine Learning
Semi-supervised Clustering by Seeding
ICML '02 Proceedings of the Nineteenth International Conference on Machine Learning
On an equivalence between PLSI and LDA
Proceedings of the 26th annual international ACM SIGIR conference on Research and development in informaion retrieval
The Journal of Machine Learning Research
A probabilistic framework for semi-supervised clustering
Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
Relation between PLSA and NMF and implications
Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval
Document clustering with prior knowledge
SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
Regularized clustering for documents
SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
Computational Statistics & Data Analysis
Exploring social annotations for information retrieval
Proceedings of the 17th international conference on World Wide Web
Semi-supervised metric learning by maximizing constraint margin
Proceedings of the 17th ACM conference on Information and knowledge management
Proceedings of the Second ACM International Conference on Web Search and Data Mining
Improving music genre classification using collaborative tagging data
Proceedings of the Second ACM International Conference on Web Search and Data Mining
ICML '09 Proceedings of the 26th Annual International Conference on Machine Learning
An analysis of the use of tags in a blog recommender system
IJCAI'07 Proceedings of the 20th international joint conference on Artifical intelligence
Flexible constrained spectral clustering
Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining
An Unsupervised Automated Essay Scoring System
IEEE Intelligent Systems
Probabilistic latent semantic analysis
UAI'99 Proceedings of the Fifteenth conference on Uncertainty in artificial intelligence
Hi-index | 0.00 |
This study employs our proposed semi-supervised clustering method called Constrained-PLSA to cluster tagged documents with a small amount of labeled documents and uses two data sets for system performance evaluations. The first data set is a document set whose boundaries among the clusters are not clear; while the second one has clear boundaries among clusters. This study employs abstracts of papers and the tags annotated by users to cluster documents. Four combinations of tags and words are used for feature representations. The experimental results indicate that almost all of the methods can benefit from tags. However, unsupervised learning methods fail to function properly in the data set with noisy information, but Constrained-PLSA functions properly. In many real applications, background knowledge is ready, making it appropriate to employ background knowledge in the clustering process to make the learning more fast and effective.