Communications of the ACM
The perceptron algorithm is fast for nonmalicious distributions
Neural Computation
Estimates of the Hermite and the Freud polynomials
Journal of Approximation Theory
On the degree of Boolean functions as real polynomials
STOC '92 Proceedings of the twenty-fourth annual ACM symposium on Theory of computing
On the degree of polynomials that approximate symmetric Boolean functions (preliminary version)
STOC '92 Proceedings of the twenty-fourth annual ACM symposium on Theory of computing
Learning in the presence of malicious errors
SIAM Journal on Computing
Statistical queries and faulty PAC oracles
COLT '93 Proceedings of the sixth annual conference on Computational learning theory
Constant depth circuits, Fourier transform, and learnability
Journal of the ACM (JACM)
Toward Efficient Agnostic Learning
Machine Learning - Special issue on computational learning theory, COLT'92
On the sample complexity of weak learning
Information and Computation
On efficient agnostic learning of linear combinations of basis functions
COLT '95 Proceedings of the eighth annual conference on Computational learning theory
On the Fourier spectrum of monotone functions
Journal of the ACM (JACM)
The harmonic sieve: a novel application of Fourier analysis to machine learning theory and practice
A decision-theoretic generalization of on-line learning and an application to boosting
Journal of Computer and System Sciences - Special issue: 26th annual ACM symposium on the theory of computing & STOC'94, May 23–25, 1994, and second annual European conference on computational learning theory (EuroCOLT'95), March 13–15, 1995
Efficient noise-tolerant learning from statistical queries
Journal of the ACM (JACM)
Learning conjunctions with noise under product distributions
Information Processing Letters
On PAC learning using Winnow, Perceptron, and a Perceptron-like algorithm
COLT '99 Proceedings of the twelfth annual conference on Computational learning theory
STOC '02 Proceedings of the thirty-fourth annual ACM symposium on Theory of computing
New degree bounds for polynomial threshold functions
Proceedings of the thirty-fifth annual ACM symposium on Theory of computing
Information Processing Letters
Learning intersections and thresholds of halfspaces
Journal of Computer and System Sciences - Special issue on FOCS 2002
Subgradient and sampling algorithms for l1 regression
SODA '05 Proceedings of the sixteenth annual ACM-SIAM symposium on Discrete algorithms
The transition to perfect generalization in perceptrons
Neural Computation
Learning disjunctions of conjunctions
IJCAI'85 Proceedings of the 9th international joint conference on Artificial intelligence - Volume 1
Efficient agnostic learning of neural networks with bounded fan-in
IEEE Transactions on Information Theory - Part 2
On the sample complexity of PAC learning half-spaces against the uniform distribution
IEEE Transactions on Neural Networks
On a theory of learning with similarity functions
ICML '06 Proceedings of the 23rd international conference on Machine learning
On hardness of learning intersection of two halfspaces
STOC '08 Proceedings of the fortieth annual ACM symposium on Theory of computing
STOC '08 Proceedings of the fortieth annual ACM symposium on Theory of computing
Agnostically learning decision trees
STOC '08 Proceedings of the fortieth annual ACM symposium on Theory of computing
Some topics in analysis of Boolean functions
STOC '08 Proceedings of the fortieth annual ACM symposium on Theory of computing
On agnostic boosting and parity learning
STOC '08 Proceedings of the fortieth annual ACM symposium on Theory of computing
A theory of learning with similarity functions
Machine Learning
Baum's Algorithm Learns Intersections of Halfspaces with Respect to Log-Concave Distributions
APPROX '09 / RANDOM '09 Proceedings of the 12th International Workshop and 13th International Workshop on Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques
COLT'07 Proceedings of the 20th annual conference on Learning theory
A lower bound for agnostically learning disjunctions
COLT'07 Proceedings of the 20th annual conference on Learning theory
The regularized least squares algorithm and the problem of learning halfspaces
Information Processing Letters
Learning hurdles for sleeping experts
Proceedings of the 3rd Innovations in Theoretical Computer Science Conference
COLT'06 Proceedings of the 19th annual conference on Learning Theory
Hardness results for agnostically learning low-degree polynomial threshold functions
Proceedings of the twenty-second annual ACM-SIAM symposium on Discrete Algorithms
Nearly optimal solutions for the chow parameters problem and low-weight approximation of halfspaces
STOC '12 Proceedings of the forty-fourth annual ACM symposium on Theory of computing
Journal of Computer and System Sciences
Learning linear and kernel predictors with the 0-1 loss function
IJCAI'11 Proceedings of the Twenty-Second international joint conference on Artificial Intelligence - Volume Three
Learning Kernel-Based Halfspaces with the 0-1 Loss
SIAM Journal on Computing
We give the first algorithm that (under distributional assumptions) efficiently learns halfspaces in the notoriously difficult agnostic framework of Kearns, Schapire, & Sellie, where a learner is given access to labeled examples drawn from a distribution, without restriction on the labels (e.g., adversarial noise). The algorithm constructs a hypothesis whose error rate on future examples is within an additive \varepsilon of the optimal halfspace, in time poly(n) for any constant \varepsilon > 0, under the uniform distribution over {-1,1}^n or the unit sphere in R^n, as well as under any log-concave distribution over R^n. It also agnostically learns Boolean disjunctions in time 2^{Õ(\sqrt n)} with respect to any distribution. The new algorithm, essentially L1 polynomial regression, is a noise-tolerant, arbitrary-distribution generalization of the "low-degree" Fourier algorithm of Linial, Mansour, & Nisan. We also give a new algorithm for PAC learning halfspaces under the uniform distribution on the unit sphere with the current best bounds on the tolerable rate of "malicious noise."
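The core of the L1 polynomial regression approach can be sketched in a few lines: expand each ±1-valued example into all multilinear monomials up to some degree d, find the polynomial minimizing the sum of absolute residuals on the sample (a linear program), and threshold it at zero. This is only a minimal illustration of the idea, not the paper's exact procedure (it omits, e.g., the choice of d and of the threshold); the function names are ours, and SciPy's LP solver stands in for a generic linear programming routine.

```python
import itertools
import numpy as np
from scipy.optimize import linprog

def monomial_features(X, degree):
    """Expand ±1-valued inputs into all multilinear monomials of degree <= `degree`."""
    n = X.shape[1]
    cols = [np.ones(len(X))]  # the empty monomial (constant term)
    for d in range(1, degree + 1):
        for S in itertools.combinations(range(n), d):
            cols.append(np.prod(X[:, S], axis=1))
    return np.column_stack(cols)

def l1_poly_regress(X, y, degree):
    """Fit a degree-d polynomial p minimizing sum_i |p(x_i) - y_i| via a linear program."""
    A = monomial_features(X, degree)
    N, m = A.shape
    # LP variables: the m coefficients w, then N slacks u with u_i >= |A_i w - y_i|.
    # Minimizing sum_i u_i subject to A_i w - y_i <= u_i and y_i - A_i w <= u_i
    # is exactly least-absolute-deviations regression.
    c = np.concatenate([np.zeros(m), np.ones(N)])
    A_ub = np.block([[A, -np.eye(N)], [-A, -np.eye(N)]])
    b_ub = np.concatenate([y, -y])
    res = linprog(c, A_ub=A_ub, b_ub=b_ub,
                  bounds=[(None, None)] * m + [(0, None)] * N)
    return res.x[:m]

def predict(w, X, degree):
    """Threshold the fitted polynomial at 0 to obtain a ±1 hypothesis."""
    return np.sign(monomial_features(X, degree) @ w)
```

The L1 (rather than L2) loss is what buys noise tolerance: a few adversarially mislabeled examples move a least-absolute-deviations fit far less than a least-squares fit.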