On the Relationship between Generalization Error, Hypothesis Complexity, and Sample Complexity for Radial Basis Functions
Feedforward networks, together with their training algorithms, are a class of regression techniques that can learn to perform a task from a set of examples. How well network performance generalizes from a finite training set to unseen data is clearly of crucial importance. In this article we first show that the generalization error can be decomposed into two terms: the approximation error, due to the limited representational capacity of a finite-sized network, and the estimation error, due to the limited information about the target function carried by a finite number of samples. We then consider the problem of learning functions belonging to certain Sobolev spaces with Gaussian radial basis functions. Using this decomposition, we bound the generalization error in terms of the number of basis functions and the number of examples. While the bound we derive is specific to radial basis functions, several observations that follow from it apply to any approximation technique. Our result also sheds light on how to choose an appropriate network architecture for a particular problem, and on the kinds of problems that can be solved effectively with finite resources, i.e., with a finite number of parameters and a finite amount of data.
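As we recall it, the decomposition and the paper's main bound take roughly the following form for a network with n Gaussian basis functions trained on l examples in d input dimensions. This is a sketch with constants suppressed; the symbols d and δ (the confidence parameter) are our notation and are not defined in the abstract.

```latex
% Sketch of the error decomposition and bound (constants suppressed).
% f_0: target function; f_{n,l}: network with n basis functions
% trained on l examples; d: input dimension; delta: confidence parameter.
\mathbb{E}\!\left[(f_0 - f_{n,l})^2\right]
  \;\lesssim\;
  \underbrace{O\!\left(\tfrac{1}{n}\right)}_{\text{approximation error}}
  \;+\;
  \underbrace{O\!\left(\sqrt{\tfrac{\,n\,d\,\ln(nl) + \ln(1/\delta)\,}{l}}\right)}_{\text{estimation error}}
  \qquad \text{with probability at least } 1-\delta .
```

The tradeoff is visible directly in this form: increasing n shrinks the approximation term but inflates the estimation term unless l grows as well, so for a fixed number of examples there is an optimal network size. A minimal numerical sketch of this tradeoff follows, assuming a one-dimensional toy target and an unregularized least-squares fit of the RBF coefficients; none of these choices comes from the paper itself.

```python
import numpy as np

# Illustrative sketch (not from the paper): fit a Gaussian RBF expansion
# with n centers to l noisy samples of a smooth target, and observe how
# test error trades off approximation error (n too small) against
# estimation error (n too large relative to l).

rng = np.random.default_rng(0)

def target(x):
    return np.sin(2 * np.pi * x)

def rbf_design(x, centers, sigma=0.1):
    # Gaussian radial basis functions exp(-(x - c)^2 / (2 sigma^2)),
    # one column per center.
    return np.exp(-((x[:, None] - centers[None, :]) ** 2) / (2 * sigma ** 2))

l = 50                                   # number of training examples
x_train = rng.uniform(0, 1, l)
y_train = target(x_train) + 0.1 * rng.standard_normal(l)
x_test = np.linspace(0, 1, 500)

for n in (2, 5, 10, 40):                 # number of basis functions
    centers = np.linspace(0, 1, n)
    Phi = rbf_design(x_train, centers)
    # Least-squares fit of the coefficients (lstsq handles rank deficiency).
    coef, *_ = np.linalg.lstsq(Phi, y_train, rcond=None)
    y_hat = rbf_design(x_test, centers) @ coef
    mse = np.mean((y_hat - target(x_test)) ** 2)
    print(f"n = {n:3d} basis functions: test MSE = {mse:.4f}")
```

On such a run one typically sees the test error first fall as n grows (approximation error dominates) and then stagnate or rise once n becomes large relative to l (estimation error dominates), which is exactly the regime the bound above describes.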