Active learning using on-line algorithms

  • Authors:
  • Chris Mesterharm;Michael J. Pazzani

  • Affiliations:
  • Rutgers, The State University of New Jersey, Piscataway, NJ, USA;Rutgers, The State University of New Jersey, Piscataway, NJ, USA

  • Venue:
  • Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining
  • Year:
  • 2011

Quantified Score

Hi-index 0.00

Visualization

Abstract

This paper describes a new technique and analysis for using on-line learning algorithms to solve active learning problems. Our algorithm is called Active Vote, and it works by actively selecting instances that force several perturbed copies of an on-line algorithm to make mistakes. The main intuition for our result is based on the fact that the number of mistakes made by the optimal on-line algorithm is a lower bound on the number of labels needed for active learning. We provide performance bounds for Active Vote in both a batch and on-line model of active learning. These performance bounds depend on the algorithm having a set of unlabeled instances in which the various perturbed on-line algorithms disagree. The motivating application for Active Vote is an Internet advertisement rating program. We conduct experiments using data collected for this advertisement problem along with experiments using standard datasets. We show Active Vote can achieve an order of magnitude decrease in the number of labeled instances over various passive learning algorithms such as Support Vector Machines.