On Agnostic Learning of Parities, Monomials, and Halfspaces

  • Authors:
  • Vitaly Feldman; Parikshit Gopalan; Subhash Khot; Ashok Kumar Ponnuswami

  • Affiliations:
  • vitaly@post.harvard.edu;parik@microsoft.com;khot@cc.gatech.edu and pashok@cc.gatech.edu;-

  • Venue:
  • SIAM Journal on Computing
  • Year:
  • 2009

Abstract

We study the learnability of several fundamental concept classes in the agnostic learning framework of [D. Haussler, Inform. and Comput., 100 (1992), pp. 78-150] and [M. Kearns, R. Schapire, and L. Sellie, Machine Learning, 17 (1994), pp. 115-141]. We show that under the uniform distribution, agnostically learning parities reduces to learning parities with random classification noise, commonly referred to as the noisy parity problem. Together with the parity learning algorithm of [A. Blum, A. Kalai, and H. Wasserman, J. ACM, 50 (2003), pp. 506-519], this gives the first nontrivial algorithm for agnostic learning of parities. We use similar techniques to reduce learning of two other fundamental concept classes under the uniform distribution to learning of noisy parities. Namely, we show that learning of disjunctive normal form (DNF) expressions reduces to learning noisy parities of just a logarithmic number of variables, and learning of $k$-juntas reduces to learning noisy parities of $k$ variables. We give essentially optimal hardness results for agnostic learning of monomials over $\{0,1\}^n$ and halfspaces over $\mathbb{Q}^n$. We show that for any constant $\epsilon > 0$, finding a monomial (halfspace) that agrees with an unknown function on a $1/2+\epsilon$ fraction of the examples is NP-hard even when there exists a monomial (halfspace) that agrees with the unknown function on a $1-\epsilon$ fraction of the examples. This resolves an open question due to Blum and significantly improves on a number of previous hardness results for these problems. We extend these results to $\epsilon=2^{-\log^{1-\lambda}n}$ ($\epsilon=2^{-\sqrt{\log n}}$ in the case of halfspaces) for any constant $\lambda > 0$ under stronger complexity assumptions.
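To make the noisy parity problem concrete, the following is a minimal illustrative sketch, not the reduction from the paper and not the Blum-Kalai-Wasserman algorithm. It draws uniform examples from $\{0,1\}^n$ labeled by a hidden parity, flips each label independently with probability `eta`, and then recovers the best-agreeing parity by exhaustive search over all $2^n$ subsets, which is feasible only for very small $n$. The function names and parameters (`noisy_parity_samples`, `best_parity`, `eta`, `m`) are hypothetical and chosen only for this illustration.

```python
import itertools
import random


def parity(subset, x):
    """Return the XOR of the bits of x at the indices in subset."""
    return sum(x[i] for i in subset) % 2


def noisy_parity_samples(n, secret, eta, m, rng):
    """Draw m uniform examples from {0,1}^n labeled by the parity on `secret`.

    Each label is flipped independently with probability eta
    (random classification noise).
    """
    samples = []
    for _ in range(m):
        x = tuple(rng.randrange(2) for _ in range(n))
        y = parity(secret, x)
        if rng.random() < eta:
            y ^= 1  # noise: flip the label
        samples.append((x, y))
    return samples


def best_parity(n, samples):
    """Brute-force search over all 2^n parities for the highest agreement.

    Exponential in n; shown only to illustrate the learning objective.
    """
    best_subset, best_agree = (), -1
    for k in range(n + 1):
        for subset in itertools.combinations(range(n), k):
            agree = sum(parity(subset, x) == y for x, y in samples)
            if agree > best_agree:
                best_subset, best_agree = subset, agree
    return best_subset, best_agree / len(samples)
```

For example, calling `noisy_parity_samples(8, (1, 4, 6), 0.2, 500, random.Random(0))` and passing the result to `best_parity(8, samples)` returns a subset together with its empirical agreement rate. In the agnostic setting the labels come from an arbitrary function rather than a parity with independent noise, but the objective of maximizing agreement is the same.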