Probabilistic Labeled Semi-supervised SVM

  • Authors:
  • Mingjie Qian;Feiping Nie;Changshui Zhang

  • Affiliations:
  • -;-;-

  • Venue:
  • ICDMW '09 Proceedings of the 2009 IEEE International Conference on Data Mining Workshops
  • Year:
  • 2009

Quantified Score

Hi-index 0.00

Visualization

Abstract

Semi-supervised learning has been paid increasing attention and is widely used in many fields such as data mining, information retrieval and knowledge management as it can utilize both labeled and unlabeled data. Laplacian SVM (LapSVM) is a very classical method whose effectiveness has been validated by large number of experiments. However, LapSVM is sensitive to labeled data and it exposes to cubic computation complexity which limit its application in large scale scenario. In this paper, we propose a multi-class method called Probabilistic labeled Semi-supervised SVM (PLSVM) in which the optimal decision surface is taught by probabilistic labels of all the training data including the labeled and unlabeled data. Then we propose a kernel version dual coordinate descent method to efficiently solve the dual problems of our Probabilistic labeled Semi-supervised SVM and decrease its requirement of memory. Synthetic data and several benchmark real world datasets show that PLSVM is less sensitive to labeling and has better performance over traditional methods like SVM, LapSVM (LapSVM) and Transductive SVM (TSVM).