On a learnability question associated to neural networks with continuous activations (extended abstract)

  • Authors:
  • Bhaskar DasGupta (Department of Computer Science, University of Minnesota, Minneapolis, MN)
  • Hava T. Siegelmann (Department of Computer Science, Bar-Ilan University, Ramat-Gan 52900, Israel)
  • Eduardo Sontag (Department of Mathematics, Rutgers University, New Brunswick, NJ)

  • Venue:
  • COLT '94: Proceedings of the seventh annual conference on Computational learning theory
  • Year:
  • 1994

Abstract

This paper deals with the learnability of concept classes defined by neural networks, showing the hardness of PAC-learning (in the complexity-theoretic, not merely information-theoretic, sense) for networks with a particular class of activation function. The obstruction lies not with the VC dimension, which is known to grow slowly; instead, the result follows from the fact that the loading problem is NP-complete. (The complexity scales badly with input dimension; the loading problem is solvable in polynomial time if the input dimension is held constant.) Similar and well-known theorems had already been proved by Megiddo and by Blum and Rivest for binary-threshold networks. It turns out that the general problem for continuous sigmoidal-type functions, as used in practical applications involving steepest descent, is not NP-hard; there are “sigmoidals” for which the problem is in fact trivial, so it is an open question to determine which properties of the activation function cause difficulties. Ours is the first result on the hardness of loading networks that do not consist of binary neurons; we employ a piecewise-linear activation function that has been used in the neural network literature. Our theoretical results lend further justification to the use of incremental (architecture-changing) techniques for training networks.
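
As a concrete illustration (not taken from the paper itself), the sketch below assumes the piecewise-linear activation in question is the standard saturating map sigma(x) = max(0, min(1, x)) and shows the kind of fixed small architecture whose loading problem the abstract refers to, i.e. deciding whether weights exist that fit a given labelled sample exactly. The two-hidden-unit shape and the NumPy names are illustrative assumptions, not the paper's construction:

import numpy as np

# Assumed activation: the saturating piecewise-linear ("semilinear") map
# sigma(x) = 0 for x <= 0, x for 0 < x < 1, and 1 for x >= 1.
def pl_activation(x):
    return np.clip(x, 0.0, 1.0)

# Forward pass of a toy fixed architecture (not necessarily the paper's):
# two piecewise-linear hidden units feeding one linear output node.
# The loading problem asks whether weights (W1, b1, w2, b2) exist that
# reproduce given labels exactly; per the abstract, this decision problem
# is NP-complete when the input dimension varies, but polynomial-time
# when the input dimension is held constant.
def forward(x, W1, b1, w2, b2):
    hidden = pl_activation(W1 @ x + b1)
    return float(w2 @ hidden + b2)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W1, b1 = rng.normal(size=(2, 3)), rng.normal(size=2)  # 2 hidden units, 3 inputs
    w2, b2 = rng.normal(size=2), 0.0
    x = rng.normal(size=3)
    print(forward(x, W1, b1, w2, b2))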