Learning linear threshold functions in the presence of classification noise

  • Authors: Tom Bylander
  • Affiliations: Division of Mathematics, Computer Science and Statistics, The University of Texas at San Antonio, San Antonio, Texas
  • Venue: COLT '94: Proceedings of the seventh annual conference on Computational learning theory
  • Year: 1994

Abstract

I show that linear threshold functions are polynomially learnable in the presence of classification noise, i.e., learnable in time polynomial in n, 1/ε, 1/δ, and 1/σ, where n is the number of Boolean attributes, ε and δ are the usual accuracy and confidence parameters, and σ indicates the minimum distance of any example from the target hyperplane, which is assumed to be lower than the average distance of the examples from any hyperplane. This result is achieved by modifying the Perceptron algorithm: for each update, a weighted average of a sample of misclassified examples and a correction vector is added to the current weight vector, where the correction vector is simply the mean of the normalized examples. Similar modifications are shown for the Weighted Majority algorithm. In the special case of Boolean threshold functions, the modified Perceptron algorithm performs O(n²ε⁻²) iterations over O(n⁴ε⁻² ln(n/(δε))) examples. This improves on the previous classification-noise result of Angluin and Laird by extending it to a much larger concept class with a similar number of examples, although it requires multiple iterations over the examples.
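The update rule described in the abstract can be illustrated with a short sketch. The Python code below is only an illustration of the idea, not Bylander's exact algorithm: the sample size, the weighting `alpha` between the sample mean and the correction vector, the number of iterations, and the stopping rule are placeholder choices, whereas in the paper these quantities are fixed by the analysis.

```python
import numpy as np

def noisy_perceptron_sketch(X, y, n_iters=1000, sample_size=200, alpha=0.5, seed=0):
    """Illustrative noise-tolerant Perceptron-style learner (not the paper's exact procedure).

    Each update adds a weighted average of (a) the mean of a sample of
    misclassified, label-signed examples and (b) a fixed correction vector,
    taken here as the mean of the normalized examples.
    """
    rng = np.random.default_rng(seed)
    # Normalize every example to unit length; the correction vector is their mean.
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    correction = Xn.mean(axis=0)
    w = np.zeros(X.shape[1])

    for _ in range(n_iters):
        # An example is misclassified when y * <w, x> <= 0 (labels in {-1, +1}).
        margins = y * (Xn @ w)
        mis = np.flatnonzero(margins <= 0)
        if mis.size == 0:
            break
        # Average a random sample of misclassified examples, signed by label.
        idx = rng.choice(mis, size=min(sample_size, mis.size), replace=False)
        sample_mean = (y[idx, None] * Xn[idx]).mean(axis=0)
        # Add a weighted average of the sample mean and the correction vector.
        w = w + alpha * sample_mean + (1.0 - alpha) * correction
    return w

# Hypothetical usage: X is an (m, n) float array, y an array of {-1, +1} labels.
# w = noisy_perceptron_sketch(X, y)
```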