Learning Halfspaces with Malicious Noise
ICALP '09 Proceedings of the 36th International Colloquium on Automata, Languages and Programming: Part I
The Journal of Machine Learning Research
We give a computationally efficient algorithm that learns (under distributional assumptions) a halfspace in the difficult agnostic framework of Kearns, Schapire, and Sellie [Mach. Learn., 17 (1994), pp. 115-141], where a learner is given access to a distribution on labelled examples but the labelling may be arbitrary (similar to malicious noise). It constructs a hypothesis whose error rate on future examples is within an additive $\epsilon$ of that of the optimal halfspace, in time poly$(n)$ for any constant $\epsilon > 0$, for the uniform distribution over $\{-1,1\}^n$ or the unit sphere in $\mathbb{R}^n$, as well as any log-concave distribution on $\mathbb{R}^n$. It also agnostically learns Boolean disjunctions in time $2^{\tilde{O}(\sqrt{n})}$ with respect to any distribution. Our algorithm, which performs $L_1$ polynomial regression, is a natural noise-tolerant, arbitrary-distribution generalization of the well-known "low-degree" Fourier algorithm of Linial, Mansour, and Nisan. We observe that significant improvements in the running time of our algorithm would yield the fastest known algorithm for learning parity with noise, a challenging open problem in computational learning theory.
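The core of the approach above can be sketched concretely: expand each example into all low-degree multilinear monomials, fit a polynomial minimizing the empirical $L_1$ error (a linear program), and output the sign of the fitted polynomial as the hypothesis. The sketch below is illustrative only; the function names and the particular LP encoding are our own choices, not the paper's, and the paper additionally chooses the threshold to minimize error rather than fixing it at 0.

```python
# Hedged sketch of L1 polynomial regression for agnostic learning over {-1,1}^n.
# Assumptions (not from the paper): names, the slack-variable LP encoding, and
# thresholding the fitted polynomial at 0.
import itertools

import numpy as np
from scipy.optimize import linprog


def monomial_features(X, degree):
    """All multilinear monomials of degree <= `degree` over +/-1 inputs."""
    n = X.shape[1]
    cols = [np.ones(len(X))]  # constant term
    for d in range(1, degree + 1):
        for S in itertools.combinations(range(n), d):
            cols.append(X[:, S].prod(axis=1))
    return np.column_stack(cols)


def l1_poly_regress(X, y, degree):
    """Coefficients c minimizing sum_i |phi(x_i) . c - y_i|, solved as an LP."""
    Phi = monomial_features(X, degree)
    m, k = Phi.shape
    # Variables: [c (free), t (slacks with t_i >= |residual_i|)];
    # minimize sum_i t_i subject to +(Phi c - y) <= t and -(Phi c - y) <= t.
    obj = np.concatenate([np.zeros(k), np.ones(m)])
    A_ub = np.block([[Phi, -np.eye(m)], [-Phi, -np.eye(m)]])
    b_ub = np.concatenate([y, -y])
    bounds = [(None, None)] * k + [(0, None)] * m
    res = linprog(obj, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    return res.x[:k]


def predict(X, coef, degree):
    """Hypothesis: sign of the fitted polynomial (threshold fixed at 0 here)."""
    return np.sign(monomial_features(X, degree) @ coef)
```

Note the design point: minimizing $L_1$ rather than $L_2$ error is what yields the noise tolerance, since the $L_1$ objective is a linear program and is far less sensitive to adversarially flipped labels than least squares.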