Learning noisy linear classifiers via adaptive and selective sampling

  • Authors:
  • Giovanni Cavallanti; Nicolò Cesa-Bianchi; Claudio Gentile

  • Affiliations:
  • DSI, Università degli Studi di Milano, Milano, Italy; DSI, Università degli Studi di Milano, Milano, Italy; DICOM, Università dell'Insubria, Varese, Italy

  • Venue:
  • Machine Learning
  • Year:
  • 2011

Abstract

We introduce efficient margin-based algorithms for selective sampling and filtering in binary classification tasks. Experiments on real-world textual data reveal that our algorithms perform significantly better than popular and similarly efficient competitors. Using the so-called Mammen-Tsybakov low noise condition to parametrize the instance distribution, and assuming linear label noise, we show bounds on the convergence rate to the Bayes risk of a weaker adaptive variant of our selective sampler. Our analysis reveals that, excluding logarithmic factors, the average risk of this adaptive sampler converges to the Bayes risk at rate $N^{-(1+\alpha)(2+\alpha)/(2(3+\alpha))}$, where $N$ denotes the number of queried labels and $\alpha \ge 0$ is the exponent in the low noise condition. For all $\alpha > \sqrt{3}-1 \approx 0.73$ this convergence rate is asymptotically faster than the rate $N^{-(1+\alpha)/(2+\alpha)}$ achieved by the fully supervised version of the base selective sampler, which queries all labels. Moreover, for $\alpha \to \infty$ (the hard margin condition) the gap between the semi-supervised and fully supervised rates becomes exponential.
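
The threshold $\sqrt{3}-1$ can be checked by comparing the two exponents directly; a brief sketch, assuming the rates read as stated above, is:

$$
\frac{(1+\alpha)(2+\alpha)}{2(3+\alpha)} > \frac{1+\alpha}{2+\alpha}
\;\Longleftrightarrow\;
(2+\alpha)^2 > 2(3+\alpha)
\;\Longleftrightarrow\;
\alpha^2 + 2\alpha - 2 > 0
\;\Longleftrightarrow\;
\alpha > \sqrt{3}-1 \approx 0.73,
$$

where the first equivalence divides both sides by $(1+\alpha) > 0$ and clears denominators, and the last takes the positive root of $\alpha^2 + 2\alpha - 2 = 0$.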