Previous semi-supervised learning (SSL) techniques usually assume that unlabeled data are relevant to the target task, i.e., that they follow the same distribution as the labeled target data. In this paper, we address a different and more difficult SSL scenario, in which the unlabeled data may be a mixture of data relevant and irrelevant to the target binary classification task. Our framework requires no explicit prior knowledge about the relatedness of the unlabeled data to the target data. To alleviate the effect of irrelevant unlabeled data while exploiting the implicit knowledge in all available data, we develop a novel maximum-margin classifier, named the tri-class support vector machine (3C-SVM), which seeks an inductive rule that separates the target binary classes well while identifying the irrelevant data as a by-product. To attain this goal, we introduce a new min loss function, which reduces the impact of irrelevant data while relying more on the labeled data and the relevant unlabeled data; this loss function thereby realizes the maximum entropy principle. The 3C-SVM thus generalizes standard SVMs, semi-supervised SVMs, and SVMs learned from the universum as special cases. We further analyze why the irrelevant data can help to improve model performance. For implementation, we relax and approximate the objective via the convex-concave procedure, which turns the original integer programming problem into solving a finite number of quadratic programming problems. Empirical results are reported to demonstrate the advantages of our 3C-SVM model.
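To make the optimization step concrete, below is a minimal sketch of the convex-concave procedure (CCCP) on a toy problem: the non-convex objective is split into a convex part and a concave part, and each iteration linearizes the concave part and solves the resulting convex subproblem (in the 3C-SVM, those subproblems are the quadratic programs mentioned above). The function names and the toy objective here are illustrative assumptions, not the paper's actual formulation.

    import numpy as np

    def cccp(solve_convex, grad_concave, theta0, tol=1e-6, max_iter=100):
        """Minimize J(theta) = J_vex(theta) + J_cav(theta) by iterating
        theta_{t+1} = argmin_theta J_vex(theta) + grad J_cav(theta_t) . theta.

        solve_convex(g):     minimizer of J_vex(theta) + g @ theta
                             (a convex subproblem, e.g. a quadratic program)
        grad_concave(theta): gradient of the concave part J_cav at theta
        """
        theta = np.asarray(theta0, dtype=float)
        for _ in range(max_iter):
            g = grad_concave(theta)        # linearize the concave part at theta_t
            theta_next = solve_convex(g)   # solve the convex surrogate
            if np.linalg.norm(theta_next - theta) < tol:
                return theta_next
            theta = theta_next
        return theta

    # Toy 1-D instance: J(x) = (x - 2)^2 - x^2 / 2, whose minimizer is x = 4.
    # Convex part: (x - 2)^2; concave part: -x^2 / 2.
    solve = lambda g: np.array([2.0 - g[0] / 2.0])  # closed-form argmin of (x-2)^2 + g*x
    grad = lambda x: -x                             # gradient of -x^2 / 2
    print(cccp(solve, grad, theta0=[0.0]))          # converges to approx. [4.0]

Each CCCP iteration strictly decreases the objective (the linearization upper-bounds the concave part), which is why the procedure terminates after a finite number of convex subproblems; in the 3C-SVM each such subproblem is a standard QP solvable by any off-the-shelf solver.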