Scatter/Gather: a cluster-based approach to browsing large document collections
SIGIR '92 Proceedings of the 15th annual international ACM SIGIR conference on Research and development in information retrieval
The nature of statistical learning theory
The nature of statistical learning theory
Machine Learning - Special issue on learning with probabilistic representations
Combining labeled and unlabeled data with co-training
COLT' 98 Proceedings of the eleventh annual conference on Computational learning theory
Web document clustering: a feasibility demonstration
Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval
Text Classification from Labeled and Unlabeled Documents using EM
Machine Learning - Special issue on information retrieval
Constrained K-means Clustering with Background Knowledge
ICML '01 Proceedings of the Eighteenth International Conference on Machine Learning
Incorporating Prior Knowledge into Boosting
ICML '02 Proceedings of the Nineteenth International Conference on Machine Learning
Semi-supervised Clustering by Seeding
ICML '02 Proceedings of the Nineteenth International Conference on Machine Learning
Transductive Inference for Text Classification using Support Vector Machines
ICML '99 Proceedings of the Sixteenth International Conference on Machine Learning
Ontologies Improve Text Document Clustering
ICDM '03 Proceedings of the Third IEEE International Conference on Data Mining
A probabilistic framework for semi-supervised clustering
Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
Incorporating prior knowledge with weighted margin support vector machines
Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
Tri-Training: Exploiting Unlabeled Data Using Three Classifiers
IEEE Transactions on Knowledge and Data Engineering
Enhancing relevance feedback in image retrieval using unlabeled data
ACM Transactions on Information Systems (TOIS)
Manifold Regularization: A Geometric Framework for Learning from Labeled and Unlabeled Examples
The Journal of Machine Learning Research
An Entropy Weighting k-Means Algorithm for Subspace Clustering of High-Dimensional Sparse Data
IEEE Transactions on Knowledge and Data Engineering
Clustering short texts using wikipedia
SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
Label Propagation through Linear Neighborhoods
IEEE Transactions on Knowledge and Data Engineering
Survey of Text Mining II: Clustering, Classification, and Retrieval
Survey of Text Mining II: Clustering, Classification, and Retrieval
Learning from labeled features using generalized expectation criteria
Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
Data Clustering: 50 Years Beyond K-means
ECML PKDD '08 Proceedings of the 2008 European Conference on Machine Learning and Knowledge Discovery in Databases - Part I
Semi-supervised learning using label mean
ICML '09 Proceedings of the 26th Annual International Conference on Machine Learning
A hybrid generative/discriminative approach to semi-supervised classifier design
AAAI'05 Proceedings of the 20th national conference on Artificial intelligence - Volume 2
Semi-supervised learning with very few labeled training examples
AAAI'07 Proceedings of the 22nd national conference on Artificial intelligence - Volume 1
Semi-Supervised Learning
Improve Computer-Aided Diagnosis With Machine Learning Techniques Using Undiagnosed Samples
IEEE Transactions on Systems, Man, and Cybernetics, Part A: Systems and Humans
Hi-index | 0.00 |
Usually, semi-supervised learning requires a number of prior knowledge to supervise the learning process, such as, seeds in Seeded-Kmeans, pair-wise constraints in COP-Kmeans, and labeled data for training an initial useful classifier in S3VM. Such prior knowledge is generally provided by the domain expert, so it is very expensive. In this paper, we propose a new automatical document labeling strategy to derive much more prior knowledge based on the very limited labeled data and the whole data set. Experimental results on 20-Newsgroup text data have shown that the new strategy is helpful for semi-supervised document categorization and improves the learning performance.