Minimizing the Quadratic Training Error of a Sigmoid Neuron Is Hard

  • Authors:
  • Jiří Šíma

  • Affiliations:
  • -

  • Venue:
  • ALT '01: Proceedings of the 12th International Conference on Algorithmic Learning Theory
  • Year:
  • 2001


Abstract

We first present a brief survey of hardness results for training feedforward neural networks. These results are then completed by a proof that even the simplest architecture, a single neuron applying the standard (logistic) activation function to the weighted sum of its n inputs, is hard to train. In particular, the problem of finding weights for such a unit that approximate the infimum of the relative quadratic training error within 1, or of its average over a training set within 13/(31n), proves to be NP-hard. Hence, the well-known back-propagation learning algorithm appears not to be efficient even for a single neuron, which has negative consequences for constructive learning.
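
For orientation, the quantities named in the abstract can be spelled out as follows. This is a minimal sketch assuming the standard definitions of the logistic activation and the quadratic training error; the symbols w, x_k, d_k, and p are ours and need not match the paper's notation.

```latex
% Logistic (standard sigmoid) activation applied to a weighted sum
% of the n inputs, with weights w_0, ..., w_n:
\[
  \sigma(\xi) = \frac{1}{1 + e^{-\xi}}, \qquad
  y(\mathbf{x}; \mathbf{w}) = \sigma\!\Bigl( w_0 + \sum_{i=1}^{n} w_i x_i \Bigr).
\]

% Quadratic training error over a training set of p pairs (x_k, d_k),
% and its average over that set:
\[
  E(\mathbf{w}) = \sum_{k=1}^{p}
    \bigl( y(\mathbf{x}_k; \mathbf{w}) - d_k \bigr)^2, \qquad
  \bar{E}(\mathbf{w}) = \frac{E(\mathbf{w})}{p}.
\]
```

Read this way, the result says that already approximating the infimum of E within relative error 1, or the infimum of its average Ē within 13/(31n), is NP-hard.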