Supervised learning of perceptron networks is investigated as an optimization problem. It is shown that both the expected and the empirical error functionals attain their minima over sets of functions computable by networks with a given number n of perceptrons. Upper bounds on the rates at which these minima converge as n increases are derived. The bounds depend on the regularity of the training data, expressed in terms of variational norms of functions interpolating the data (in the case of the empirical error) or of the regression function (in the case of the expected error). The dependence of this regularity on the dimensionality and on the magnitudes of partial derivatives is investigated. Conditions on the data are derived which guarantee that global minima of the error functionals can be well approximated by networks of limited complexity. The conditions are stated in terms of the oscillatory behavior of the data, measured by the product of a function of the number of variables d that decreases exponentially fast with d and the maximum of the squared L1-norms of the iterated partial derivatives of order d of the regression function (or of a function interpolating the data sample). The results are illustrated by examples of data with low and high regularity constructed from Boolean functions and the Gaussian function.
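As a hedged illustration of the two objects the abstract refers to (the notation below is illustrative and assumes a quadratic loss; the paper's exact definitions and constants may differ): given a sample z = {(x_i, y_i), i = 1, ..., m}, the empirical error functional of a function f can be written as

  \mathcal{E}_z(f) = \frac{1}{m} \sum_{i=1}^{m} \bigl( f(x_i) - y_i \bigr)^2 ,

and the classical Maurey-Jones-Barron-type rate bound, stated in terms of the variational norm \|h\|_G with respect to the set G of functions computable by a single perceptron (with s_G = \sup_{g \in G} \|g\|), takes the form

  \min_{f \in \operatorname{span}_n G} \|h - f\| \le \sqrt{\frac{(s_G \, \|h\|_G)^2 - \|h\|^2}{n}} ,

so bounds of this kind decrease as O(1/\sqrt{n}) in the number n of perceptrons whenever the variational norm of the interpolating function or of the regression function is finite, which is the sense in which the regularity of the data controls the achievable network complexity.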