Learning linear threshold functions in the presence of classification noise

  • Authors: Tom Bylander
  • Affiliations: Division of Mathematics, Computer Science and Statistics, The University of Texas at San Antonio, San Antonio, Texas
  • Venue: COLT '94: Proceedings of the seventh annual conference on Computational learning theory
  • Year: 1994

Abstract

I show that linear threshold functions are polynomially learnable in the presence of classification noise, i.e., learnable in time polynomial in n, 1/ε, 1/δ, and 1/σ, where n is the number of Boolean attributes, ε and δ are the usual accuracy and confidence parameters, and σ indicates the minimum distance of any example from the target hyperplane, which is assumed to be lower than the average distance of the examples from any hyperplane. This result is achieved by modifying the Perceptron algorithm: for each update, a weighted average of a sample of misclassified examples and a correction vector is added to the current weight vector, where the correction vector is simply the mean of the normalized examples. Similar modifications are shown for the Weighted Majority algorithm. In the special case of Boolean threshold functions, the modified Perceptron algorithm performs O(n²ε⁻²) iterations over O(n⁴ε⁻² ln(n/(δε))) examples. This improves on the previous classification-noise result of Angluin and Laird by extending it to a much larger concept class with a similar number of examples, although it requires multiple iterations over the examples.
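The update rule described in the abstract can be illustrated with a short sketch. The Python code below is only an illustration of the idea, not Bylander's exact algorithm: the sample size, the weighting `alpha` between the sample mean and the correction vector, the number of iterations, and the stopping rule are placeholder choices, whereas in the paper these quantities are fixed by the analysis.

```python
import numpy as np

def noisy_perceptron_sketch(X, y, n_iters=1000, sample_size=200, alpha=0.5, seed=0):
    """Illustrative noise-tolerant Perceptron-style learner (not the paper's exact procedure).

    Each update adds a weighted average of (a) the mean of a sample of
    misclassified, label-signed examples and (b) a fixed correction vector,
    taken here as the mean of the normalized examples.
    """
    rng = np.random.default_rng(seed)
    # Normalize every example to unit length; the correction vector is their mean.
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    correction = Xn.mean(axis=0)
    w = np.zeros(X.shape[1])

    for _ in range(n_iters):
        # An example is misclassified when y * <w, x> <= 0 (labels in {-1, +1}).
        margins = y * (Xn @ w)
        mis = np.flatnonzero(margins <= 0)
        if mis.size == 0:
            break
        # Average a random sample of misclassified examples, signed by label.
        idx = rng.choice(mis, size=min(sample_size, mis.size), replace=False)
        sample_mean = (y[idx, None] * Xn[idx]).mean(axis=0)
        # Add a weighted average of the sample mean and the correction vector.
        w = w + alpha * sample_mean + (1.0 - alpha) * correction
    return w

# Hypothetical usage: X is an (m, n) float array, y an array of {-1, +1} labels.
# w = noisy_perceptron_sketch(X, y)
```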