Support vector machine active learning with applications to text classification

  • Authors:
  • Simon Tong; Daphne Koller

  • Affiliations:
  • Computer Science Department, Stanford University, Stanford, CA; Computer Science Department, Stanford University, Stanford, CA

  • Venue:
  • The Journal of Machine Learning Research
  • Year:
  • 2002

Abstract

Support vector machines have met with significant success in numerous real-world learning tasks. However, like most machine learning algorithms, they are generally applied using a randomly selected training set classified in advance. In many settings, we also have the option of using pool-based active learning. Instead of using a randomly selected training set, the learner has access to a pool of unlabeled instances and can request the labels for some number of them. We introduce a new algorithm for performing active learning with support vector machines, i.e., an algorithm for choosing which instances to request next. We provide a theoretical motivation for the algorithm using the notion of a version space. We present experimental results showing that employing our active learning method can significantly reduce the need for labeled training instances in both the standard inductive and transductive settings.
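The pool-based setting described in the abstract can be illustrated with a minimal sketch. The abstract does not state the selection rule, so the sketch assumes one common heuristic: query the unlabeled pool instance closest to the current decision boundary. The trainer is a small Pegasos-style linear SVM without a bias term, used here only to keep the example dependency-free. All names (`train_linear_svm`, `active_learn`, `oracle`) and the synthetic data are hypothetical and are not taken from the paper.

```python
import random

def dot(w, x):
    return sum(wj * xj for wj, xj in zip(w, x))

def train_linear_svm(X, y, lam=0.01, epochs=200, seed=0):
    """Pegasos-style subgradient descent on the regularised hinge loss (no bias)."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    t = 0
    for _ in range(epochs):
        order = list(range(len(X)))
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)           # decaying step size
            margin = y[i] * dot(w, X[i])    # computed before the update
            w = [(1.0 - eta * lam) * wj for wj in w]
            if margin < 1.0:                # hinge loss is active for this point
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w

def active_learn(pool, oracle, seed_indices, n_queries):
    """Pool-based loop: retrain, then request the label of the pool
    instance nearest the current hyperplane (assumed heuristic)."""
    labeled = {i: oracle(i) for i in seed_indices}
    for _ in range(n_queries):
        idx = sorted(labeled)
        w = train_linear_svm([pool[i] for i in idx], [labeled[i] for i in idx])
        unlabeled = [i for i in range(len(pool)) if i not in labeled]
        query = min(unlabeled, key=lambda i: abs(dot(w, pool[i])))
        labeled[query] = oracle(query)      # one label request per round
    idx = sorted(labeled)
    w = train_linear_svm([pool[i] for i in idx], [labeled[i] for i in idx])
    return w, labeled

# Synthetic pool (an assumption for illustration): 2-D points whose true
# label is the sign of x0 + x1, with points very near the boundary dropped.
rng = random.Random(0)
pool = []
while len(pool) < 200:
    x = (rng.uniform(-1, 1), rng.uniform(-1, 1))
    if abs(x[0] + x[1]) > 0.1:
        pool.append(x)

def oracle(i):
    """Stands in for the human labeler; returns the true label of pool[i]."""
    return 1 if pool[i][0] + pool[i][1] > 0 else -1

# Start from one labeled example of each class, then make 10 label requests.
seeds = [next(i for i in range(len(pool)) if oracle(i) == 1),
         next(i for i in range(len(pool)) if oracle(i) == -1)]
w, labeled = active_learn(pool, oracle, seeds, n_queries=10)

accuracy = sum((1 if dot(w, x) > 0 else -1) == oracle(i)
               for i, x in enumerate(pool)) / len(pool)
print(f"labels used: {len(labeled)}, pool accuracy: {accuracy:.2f}")
```

Replacing the `min(...)` selection with a uniformly random choice from `unlabeled` gives the randomly selected training set that the abstract uses as its point of comparison.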