Convex and scalable weakly labeled SVMs

  • Authors:
  • Yu-Feng Li;Ivor W. Tsang;James T. Kwok;Zhi-Hua Zhou

  • Affiliations:
  • National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China;School of Computer Engineering, Nanyang Technological University, Singapore;Department of Computer Science and Engineering, Hong Kong University of Science & Technology, Hong Kong;National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China

  • Venue:
  • The Journal of Machine Learning Research
  • Year:
  • 2013

Abstract

In this paper, we study the problem of learning from weakly labeled data, where the labels of the training examples are incomplete. This includes, for example, (i) semi-supervised learning, where labels are partially known; (ii) multi-instance learning, where labels are implicitly known; and (iii) clustering, where labels are completely unknown. Unlike supervised learning, learning with weak labels involves a difficult Mixed-Integer Programming (MIP) problem. It can therefore suffer from poor scalability and may get stuck in a local minimum. We focus on SVMs and propose WELLSVM, which is based on a novel label generation strategy. This leads to a convex relaxation of the original MIP that is at least as tight as existing convex Semi-Definite Programming (SDP) relaxations. Moreover, WELLSVM can be solved via a sequence of SVM subproblems, which are much more scalable than previous convex SDP relaxations. Experiments on three weakly labeled learning tasks, namely (i) semi-supervised learning; (ii) multi-instance learning for locating regions of interest in content-based information retrieval; and (iii) clustering, demonstrate improved performance and show that WELLSVM is readily applicable to large data sets.
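
To make the MIP and its convex relaxation more concrete, the following is a minimal sketch for a hinge-loss SVM without a bias term. All notation is assumed here for illustration and does not appear in the abstract: the kernel matrix K, the candidate labeling \hat{y}, the set \mathcal{B} of labelings consistent with the weak label information, the dual feasible set \mathcal{A}, and the simplex \mathcal{M} of weights over labelings. The paper's exact formulation may differ.

```latex
% Assumed notation (illustrative only):
%   \hat{y} \in \{-1,+1\}^n : a candidate labeling of the n training examples
%   \mathcal{B}             : labelings consistent with the weak label information
%   \mathcal{A} = \{\alpha : 0 \le \alpha \le C\mathbf{1}\} : SVM dual feasible set
%   \mathcal{M} = \{\mu : \mu \ge 0,\ \textstyle\sum_t \mu_t = 1\} : simplex over labelings
\begin{align}
  G(\alpha, \hat{y})
    &= \mathbf{1}^\top \alpha
       - \tfrac{1}{2}\, \alpha^\top \bigl( K \odot \hat{y}\hat{y}^\top \bigr) \alpha
    && \text{(SVM dual objective for a fixed labeling)} \\
  \text{MIP:}\quad
    & \min_{\hat{y} \in \mathcal{B}} \; \max_{\alpha \in \mathcal{A}} \; G(\alpha, \hat{y})
    && \text{(combinatorial in } \hat{y}\text{)} \\
  \text{Relaxation:}\quad
    & \max_{\alpha \in \mathcal{A}} \; \min_{\hat{y} \in \mathcal{B}} \; G(\alpha, \hat{y})
      = \min_{\mu \in \mathcal{M}} \; \max_{\alpha \in \mathcal{A}} \;
        \sum_{t \,:\, \hat{y}_t \in \mathcal{B}} \mu_t \, G(\alpha, \hat{y}_t)
    && \text{(convex in } \mu\text{)}
\end{align}
```

Under these assumptions, the minimax inequality shows that the relaxed value is a lower bound on the MIP value. Only labelings with nonzero weight \mu_t matter, so one plausible reading of the "label generation strategy" is that labelings are added one at a time and \alpha is re-solved as an SVM-like subproblem on the combined kernel \sum_t \mu_t \, K \odot \hat{y}_t\hat{y}_t^\top.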