On the hardness of learning intersections of two halfspaces
Journal of Computer and System Sciences
Hardness results for agnostically learning low-degree polynomial threshold functions
Proceedings of the Twenty-Second Annual ACM-SIAM Symposium on Discrete Algorithms
Learning an unknown halfspace (also called a perceptron) from labeled examples is one of the classic problems in machine learning. In the noise-free case, when a halfspace consistent with all the training examples exists, the problem can be solved in polynomial time using linear programming. However, under the promise that a halfspace consistent with a fraction $(1-\varepsilon)$ of the examples exists (for some small constant $\varepsilon > 0$), it was not known how to efficiently find a halfspace that is correct on even 51% of the examples. Nor was any hardness result known that ruled out getting agreement on more than 99.9% of the examples. In this work, we close this gap in our understanding and prove that even a tiny amount of worst-case noise makes the problem of learning halfspaces intractable in a strong sense. Specifically, for arbitrary $\varepsilon, \delta > 0$, we prove that given a set of example-label pairs from the hypercube, a fraction $(1-\varepsilon)$ of which can be explained by a halfspace, it is NP-hard to find a halfspace that correctly labels a fraction $(1/2+\delta)$ of the examples. The hardness result is tight, since it is trivial to get agreement on $1/2$ of the examples. In learning theory parlance, we prove that weak proper agnostic learning of halfspaces is hard. This settles a question that was raised by Blum et al. in their work on learning halfspaces in the presence of random classification noise [Algorithmica, 22 (1998), pp. 35-52], and raised by authors of some more recent works as well. Along the way, we also obtain a strong hardness result for another basic computational problem: solving a linear system over the rationals.
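The tractable noise-free case mentioned in the abstract can be illustrated concretely: finding a halfspace consistent with all examples is a linear-programming feasibility problem. Below is a minimal sketch in Python using `scipy.optimize.linprog`; the function name, the margin normalization to 1, and the sample data are illustrative assumptions, not part of the paper.

```python
# Noise-free halfspace learning via LP: find (w, b) such that
# y_i * (w . x_i + b) >= 1 for every labeled example (x_i, y_i).
# This is a feasibility LP (zero objective); the margin of 1 is just
# a convenient normalization, since any positive margin can be rescaled.
import numpy as np
from scipy.optimize import linprog

def learn_halfspace(X, y):
    """X: (n, d) array of examples; y: labels in {-1, +1}.
    Returns (w, b) consistent with all examples, or None if none exists."""
    n, d = X.shape
    # Variables are w (d entries) followed by b. Row i encodes
    # -y_i * (w . x_i + b) <= -1, i.e. correct classification with margin 1.
    A_ub = -y[:, None] * np.hstack([X, np.ones((n, 1))])
    b_ub = -np.ones(n)
    res = linprog(c=np.zeros(d + 1), A_ub=A_ub, b_ub=b_ub,
                  bounds=[(None, None)] * (d + 1))
    if not res.success:
        return None  # no consistent halfspace exists
    return res.x[:d], res.x[d]

# Example: hypercube points labeled by a halfspace (here, sign(x1 + x2)).
X = np.array([[1, 1, 1], [1, 1, -1], [-1, -1, 1], [-1, -1, -1]], dtype=float)
y = np.array([1.0, 1.0, -1.0, -1.0])
w, b = learn_halfspace(X, y)
```

The paper's point is that this clean picture collapses under adversarial noise: once only a $(1-\varepsilon)$ fraction of the examples is guaranteed consistent, no polynomial-time algorithm can beat the trivial $1/2$ agreement unless P = NP.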