Agnostic active learning

  • Authors:
  • Maria-Florina Balcan; Alina Beygelzimer; John Langford

  • Affiliations:
  • Carnegie Mellon University, Pittsburgh, PA 15213, USA; IBM T. J. Watson Research Center, Hawthorne, NY 10532, USA; Yahoo! Research, New York, NY 10011, USA

  • Venue:
  • Journal of Computer and System Sciences
  • Year:
  • 2009

Abstract

We state and analyze the first active learning algorithm that finds an ε-optimal hypothesis in any hypothesis class, when the underlying distribution has arbitrary forms of noise. The algorithm, A² (for Agnostic Active), relies only upon the assumption that it has access to a stream of unlabeled examples drawn i.i.d. from a fixed distribution. We show that A² achieves an exponential improvement (i.e., requires only O(ln(1/ε)) samples to find an ε-optimal classifier) over the usual sample complexity of supervised learning, for several settings considered before in the realizable case. These include learning threshold classifiers and learning homogeneous linear separators with respect to an input distribution that is uniform over the unit sphere.
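
To give a feel for where the O(ln(1/ε)) label complexity comes from, here is a minimal sketch of disagreement-based active learning for the threshold-classifier setting mentioned in the abstract, in the noise-free (realizable) case only. This is not the authors' A² algorithm: A² additionally maintains upper and lower confidence bounds on hypothesis errors so that it tolerates arbitrary noise, machinery this sketch omits. The function and oracle names (`active_learn_threshold`, `label_oracle`) are hypothetical.

```python
import random


def active_learn_threshold(label_oracle, eps, n_unlabeled=100000, seed=0):
    """Illustrative disagreement-based active learner for thresholds
    h_t(x) = sign(x - t) on [0, 1], realizable (noise-free) case.

    Labels are requested only for unlabeled points that fall inside the
    current region of disagreement [lo, hi]; points outside it receive the
    same label from every surviving hypothesis, so no query is needed.
    Each query shrinks the disagreement region by a constant factor in
    expectation, which is the source of the O(ln(1/eps)) label complexity.
    """
    rng = random.Random(seed)
    lo, hi = 0.0, 1.0          # disagreement region = surviving thresholds
    labels_used = 0
    for _ in range(n_unlabeled):
        if hi - lo <= eps:     # all surviving thresholds are eps-close: done
            break
        x = rng.random()       # unlabeled example drawn i.i.d. from U[0, 1]
        if not (lo < x < hi):  # no disagreement at x: skip without querying
            continue
        y = label_oracle(x)    # label query, only inside the disagreement region
        labels_used += 1
        if y > 0:
            hi = x             # positive label: true threshold is at most x
        else:
            lo = x             # negative label: true threshold exceeds x
    return (lo + hi) / 2.0, labels_used


if __name__ == "__main__":
    true_t = 0.37
    oracle = lambda x: 1 if x >= true_t else -1
    t_hat, queries = active_learn_threshold(oracle, eps=1e-3)
    print(f"estimated threshold {t_hat:.4f} using {queries} label queries")
```

In this realizable sketch the number of label queries grows roughly logarithmically in 1/ε, whereas passive supervised learning would need on the order of 1/ε labeled examples; the paper's contribution is showing that a comparable improvement is still achievable agnostically, without assumptions on the noise.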