Relation between weight size and degree of over-fitting in neural network regression

  • Authors:
  • Katsuyuki Hagiwara; Kenji Fukumizu

  • Affiliations:
  • Faculty of Education, Mie University, 1577 Kurima-Machiya-cho, Tsu 514-8507, Japan; Institute of Statistical Mathematics, ROIS, 4-6-7 Minami-azabu, Minato-ku, Tokyo 106-8569, Japan

  • Venue:
  • Neural Networks
  • Year:
  • 2008

Abstract

This paper investigates the relation between over-fitting and weight size in neural network regression. The over-fitting of a network to Gaussian noise is discussed. Using re-parametrization, a network function is represented as a bounded function g multiplied by a coefficient c. The squared sum of the outputs of g at the given inputs is then bounded from below by a positive constant δn, which restricts the weight size of the network and enables a probabilistic upper bound on the degree of over-fitting to be derived. This reveals that the order of the probabilistic upper bound can change depending on δn. By applying the bound to analyze the over-fitting behavior of a single Gaussian unit, it is shown that the probability of obtaining an extremely small value for the width parameter in training is close to one when the sample size is large.
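As a rough numerical illustration of the abstract's last claim (a sketch, not the paper's analysis), one can fit a single Gaussian unit c·exp(-(x - m)²/s²) to pure Gaussian noise and observe that the selected width s collapses toward the smallest value allowed: a very narrow unit can interpolate the largest noise point exactly, which lowers the training error more than any smooth fit. The sample size, seed, and grid of candidate widths and centers below are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x = np.linspace(-1.0, 1.0, n)
y = rng.normal(0.0, 1.0, n)  # pure Gaussian noise as the regression target


def fit_gaussian_unit(x, y, widths, centers):
    """Grid search over (width s, center m) for one Gaussian unit.

    For fixed s and m, g_i = exp(-(x_i - m)^2 / s^2) and the
    least-squares coefficient is c = <g, y> / <g, g> in closed form.
    Returns (training error, (s, m, c)) for the best grid point.
    """
    best_err, best_params = np.inf, None
    for s in widths:
        for m in centers:
            g = np.exp(-((x - m) ** 2) / s**2)
            c = (g @ y) / (g @ g)  # g @ g >= 1 since g(m) = 1
            err = np.sum((y - c * g) ** 2)
            if err < best_err:
                best_err, best_params = err, (s, m, c)
    return best_err, best_params


widths = np.geomspace(1e-3, 1.0, 40)   # candidate width parameters
centers = x                            # restrict centers to sample points
err, (s_hat, m_hat, c_hat) = fit_gaussian_unit(x, y, widths, centers)
print(f"selected width: {s_hat:.4g}, training error: {err:.2f}")
```

In runs like this the selected width is typically far smaller than the data range, mirroring the abstract's claim that training drives the width parameter toward extremely small values with high probability for large samples.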