Online learning from finite training sets in nonlinear networks
NIPS '97: Proceedings of the 1997 Conference on Advances in Neural Information Processing Systems 10
We analyze online gradient descent learning from finite training sets at non-infinitesimal learning rates η. Exact results are obtained for the time-dependent generalization error of a simple model system: a linear network with a large number of weights N, trained on p = αN examples. This allows us to study in detail how the finite training-set size α affects, for example, the optimal choice of learning rate η. We also compare online and offline learning, with η optimized separately for each at a given final learning time. Online learning turns out to be much more robust to input bias and actually outperforms offline learning when such bias is present; for unbiased inputs, online and offline learning perform almost equally well.
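The setup described in the abstract can be illustrated with a minimal sketch (not the paper's exact model or its optimal-η analysis): a linear "student" with N weights learns a random "teacher" from p = αN examples, trained either online (one random example per update) or offline (full-batch gradient per update). All concrete values here (N, α, η, step counts) are illustrative choices, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

N = 50                    # number of weights
alpha = 2.0               # training-set size ratio p / N
p = int(alpha * N)        # p = alpha * N examples
eta = 0.2                 # finite (non-infinitesimal) learning rate; shared
                          # here for simplicity, though the paper optimizes
                          # eta separately for online and offline learning

teacher = rng.standard_normal(N)                # target weight vector
X = rng.standard_normal((p, N)) / np.sqrt(N)    # inputs scaled so |x|^2 ~ 1
y = X @ teacher                                 # noiseless linear targets


def gen_error(w):
    """Generalization error over the (Gaussian) input distribution:
    E[(w.x - teacher.x)^2] = |w - teacher|^2 / N for this input scaling."""
    return np.sum((w - teacher) ** 2) / N


def online_gd(n_steps):
    """Online gradient descent: one randomly drawn training example per update."""
    w = np.zeros(N)
    for _ in range(n_steps):
        i = rng.integers(p)
        err = X[i] @ w - y[i]
        w -= eta * err * X[i]
    return w


def offline_gd(n_steps):
    """Offline (batch) gradient descent: gradient over the whole training set."""
    w = np.zeros(N)
    for _ in range(n_steps):
        w -= eta * X.T @ (X @ w - y)
    return w


w_on = online_gd(20_000)
w_off = offline_gd(20_000)
print(gen_error(np.zeros(N)), gen_error(w_on), gen_error(w_off))
```

Since the training data here are noiseless and p > N, both variants can recover the teacher; the paper's interesting regime (noise, input bias, finite-α effects on the optimal η) is what the exact analysis addresses.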