We consider the problem of efficiently learning in two-layer neural networks. We investigate the computational complexity of agnostically learning with simple families of neural networks as the hypothesis classes. We show that it is NP-hard to find a linear threshold network of a fixed size that approximately minimizes the proportion of misclassified examples in a training set, even if there is a network that correctly classifies all of the training examples. In particular, for a training set that is correctly classified by some two-layer linear threshold network with k hidden units, it is NP-hard to find such a network that misclassifies a proportion smaller than c/k^2 of the examples, for some constant c. We prove a similar result for the problem of approximately minimizing the quadratic loss of a two-layer network with a sigmoid output unit.
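To make the objects in the abstract concrete, here is a minimal sketch in Python with NumPy; the function names, parameters, and toy data are illustrative assumptions, not from the paper. It shows the hypothesis class in question (a two-layer linear threshold network with k hidden units) and the training-set quantity the hardness result concerns (the proportion of misclassified examples).

```python
import numpy as np

def two_layer_threshold_net(x, W, b, v, theta):
    """Two-layer linear threshold network with k hidden units.

    Each hidden unit i outputs +1 if w_i . x + b_i >= 0 and -1 otherwise;
    the output unit thresholds a weighted sum of the hidden activations.
    """
    hidden = np.where(W @ x + b >= 0, 1.0, -1.0)      # k hidden threshold units
    return 1.0 if v @ hidden + theta >= 0 else -1.0    # threshold output unit

def misclassified_proportion(X, y, W, b, v, theta):
    """Proportion of training examples (x_j, y_j) the network mislabels.

    This is the objective referred to in the abstract: it is NP-hard to find
    a k-hidden-unit network that drives it below c/k^2, even when some
    network of this form classifies every training example correctly.
    """
    preds = np.array([two_layer_threshold_net(x, W, b, v, theta) for x in X])
    return float(np.mean(preds != y))

# Toy usage (hypothetical data): k = 3 hidden units over 5-dimensional inputs.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))
y = np.where(X[:, 0] + X[:, 1] >= 0, 1.0, -1.0)
W, b = rng.normal(size=(3, 5)), rng.normal(size=3)
v, theta = rng.normal(size=3), 0.0
print(misclassified_proportion(X, y, W, b, v, theta))
```

An analogous sketch for the second result would replace the threshold output unit with a sigmoid and replace the misclassification count with the average squared error over the training set.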