Scaling relationships in back-propagation learning
Complex Systems
On the complexity of loading shallow neural networks
Journal of Complexity - Special Issue on Neural Computation
Computational limitations on learning from examples
Journal of the ACM (JACM)
Learnability and the Vapnik-Chervonenkis dimension
Journal of the ACM (JACM)
Neural network design and the complexity of learning
Complexity Results on Learning by Neural Nets
Machine Learning
The cascade-correlation learning architecture
Advances in neural information processing systems 2
Computational limitations on training sigmoid neural networks
Information Processing Letters
Finiteness results for sigmoidal “neural” networks
STOC '93 Proceedings of the twenty-fifth annual ACM symposium on Theory of computing
Feedforward nets for interpolation and classification
Journal of Computer and System Sciences
Neural Computation
Robust trainability of single neurons
Journal of Computer and System Sciences
The neural network loading problem is undecidable
Euro-COLT '93 Proceedings of the first European conference on Computational learning theory
On the geometric separability of Boolean functions
Discrete Applied Mathematics
Back-propagation is not efficient
Neural Networks
The hardness of approximate optima in lattices, codes, and systems of linear equations
Journal of Computer and System Sciences
On the infeasibility of training neural networks with small squared errors
NIPS '97 Proceedings of the 1997 conference on Advances in neural information processing systems 10
Training a sigmoidal node is hard
Neural Computation
Neural Networks: A Comprehensive Foundation
A Theory of Learning and Generalization: With Applications to Neural Networks and Control Systems
Learning in Neural Networks: Theoretical Foundations
Theoretical Advances in Neural Computation and Learning
Computers and Intractability: A Guide to the Theory of NP-Completeness
Minimizing the Quadratic Training Error of a Sigmoid Neuron Is Hard
ALT '01 Proceedings of the 12th International Conference on Algorithmic Learning Theory
Hardness Results for General Two-Layer Neural Networks
COLT '00 Proceedings of the Thirteenth Annual Conference on Computational Learning Theory
The computational intractability of training sigmoidal neural networks
IEEE Transactions on Information Theory
Classification of linearly nonseparable patterns by linear threshold elements
IEEE Transactions on Neural Networks
On the complexity of training neural networks with continuous activation functions
IEEE Transactions on Neural Networks
Some Dichotomy Theorems for Neural Learning Problems
The Journal of Machine Learning Research
Loading Deep Networks Is Hard: The Pyramidal Case
Neural Computation
On the Nonlearnability of a Single Spiking Neuron
Neural Computation
The loading problem for recursive neural networks
Neural Networks - Special issue on neural networks and kernel methods for structured domains
Long-range out-of-sample properties of autoregressive neural networks
Neural Computation
We first present a brief survey of hardness results for training feedforward neural networks. We then complete these results with a proof that even the simplest architecture is hard to train: a single neuron that applies a sigmoidal activation function σ: R → [α, β], satisfying certain natural axioms (e.g., the standard logistic sigmoid or the saturated-linear function), to the weighted sum of its n inputs. In particular, the problem of finding weights for such a unit that bring the quadratic training error within (β − α)² of its infimum, or the average error over a training set within 5(β − α)²/(12n) of its infimum, is NP-hard. Hence the well-known backpropagation learning algorithm appears not to be efficient even for a single neuron, which has negative consequences for constructive learning.
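For concreteness, the quantity whose approximate minimization is shown NP-hard can be sketched as follows. This is a minimal illustration, assuming the standard logistic sigmoid (so α = 0, β = 1); the training set, learning rate, and iteration count are hypothetical choices, not taken from the paper.

```python
import math

def sigmoid(z):
    # standard logistic sigmoid, range (0, 1): alpha = 0, beta = 1
    return 1.0 / (1.0 + math.exp(-z))

def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def quadratic_error(w, samples):
    # total quadratic training error E(w) = sum_k (sigma(w . x_k) - y_k)^2
    return sum((sigmoid(dot(w, x)) - y) ** 2 for x, y in samples)

def gradient_step(w, samples, lr=0.5):
    # one backpropagation (gradient-descent) step on E(w),
    # using sigma'(z) = sigma(z) * (1 - sigma(z)) for the logistic sigmoid
    grad = [0.0] * len(w)
    for x, y in samples:
        s = sigmoid(dot(w, x))
        g = 2.0 * (s - y) * s * (1.0 - s)
        for i, xi in enumerate(x):
            grad[i] += g * xi
    return [wi - lr * gi for wi, gi in zip(w, grad)]

# hypothetical toy training set: two points with n = 2 inputs
samples = [((1.0, 0.0), 1.0), ((0.0, 1.0), 0.0)]
w = [0.0, 0.0]
e0 = quadratic_error(w, samples)   # error at the zero weight vector
for _ in range(100):
    w = gradient_step(w, samples)
e1 = quadratic_error(w, samples)   # error after 100 gradient steps
```

On such an easy, separable toy set the error decreases steadily; the hardness result says that no efficient algorithm can guarantee getting within (β − α)² of the infimum of E(w) in the worst case, so descent procedures like this one can fail on adversarially constructed training sets.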