Gradient descent training of neural networks can be done in either a batch or on-line manner. A widely held myth in the neural network community is that batch training is as fast as or faster than on-line training, and/or more 'correct', because it supposedly uses a better approximation of the true gradient for its weight updates. This paper explains why batch training is almost always slower than on-line training--often orders of magnitude slower--especially on large training sets. The main reason is that on-line training can follow curves in the error surface throughout each epoch, which allows it to safely use a larger learning rate and thus converge in fewer iterations through the training data. Empirical results on a large (20,000-instance) speech recognition task and on 26 other learning tasks demonstrate that on-line training reaches convergence significantly faster than batch training, with no apparent difference in accuracy.
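To make the batch/on-line distinction concrete, the following is a minimal sketch (not the paper's actual experimental code): batch training accumulates the gradient over the entire training set and applies one weight update per epoch, while on-line training updates the weights immediately after each example, so later examples within the same epoch already see the effect of earlier updates. The toy linear-regression data, learning rate, and epoch count below are illustrative assumptions only.

    import numpy as np

    def batch_epoch(w, X, y, lr):
        """One epoch of batch training: sum the gradient over all
        examples, then apply a single weight update at the end."""
        grad = np.zeros_like(w)
        for x_i, y_i in zip(X, y):
            err = x_i @ w - y_i        # prediction error for this example
            grad += err * x_i          # accumulate its gradient contribution
        return w - lr * grad / len(X)  # one update per epoch (mean gradient)

    def online_epoch(w, X, y, lr):
        """One epoch of on-line training: update the weights after every
        example, following the error surface as it changes during the epoch."""
        for x_i, y_i in zip(X, y):
            err = x_i @ w - y_i
            w = w - lr * err * x_i     # one update per example
        return w

    # Illustrative synthetic data (not from the paper's experiments).
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 3))
    true_w = np.array([1.5, -2.0, 0.5])
    y = X @ true_w + 0.01 * rng.normal(size=200)

    w_batch = np.zeros(3)
    w_online = np.zeros(3)
    for _ in range(20):
        w_batch = batch_epoch(w_batch, X, y, lr=0.1)
        w_online = online_epoch(w_online, X, y, lr=0.1)

    print("batch :", w_batch)
    print("online:", w_online)

Both routines see the training data the same number of times per epoch; the difference is only in when the weights change, which is the mechanism the paper argues lets on-line training tolerate a larger learning rate.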