This article compares three penalty terms with respect to the efficiency of supervised learning, using first- and second-order off-line learning algorithms and a first-order on-line algorithm. Our experiments show that, given a reasonably adequate penalty factor, the combination of the squared penalty term and the second-order learning algorithm converges dramatically faster than the other combinations, while also yielding excellent generalization performance. Moreover, to clarify how the penalty terms differ in their effect, we describe an evaluation of the function surface each one induces. Finally, we show how cross-validation can be applied to find an optimal penalty factor.
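As a rough illustration of the setup summarized above, the sketch below shows batch gradient descent on an objective with a squared penalty term (weight decay), together with selection of the penalty factor by k-fold cross-validation. It uses a plain linear model in NumPy for brevity rather than a multilayer network, and every function and parameter name in it (fit_penalized, cross_validate, lam) is hypothetical; it is not the authors' implementation or their second-order algorithm.

```python
# Minimal sketch (assumptions noted above): squared penalty term plus
# cross-validated choice of the penalty factor, on a linear model.
import numpy as np

def fit_penalized(X, y, lam, lr=0.01, epochs=500):
    """Batch gradient descent on E(w) = ||y - Xw||^2 + (lam/2)||w||^2."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        # Data-fit gradient plus the penalty-term gradient lam * w.
        grad = -2.0 * X.T @ (y - X @ w) + lam * w
        w -= lr * grad / len(y)   # step scaled by sample count
    return w

def cross_validate(X, y, lams, k=5, seed=0):
    """Return the penalty factor with the lowest mean validation error."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    scores = []
    for lam in lams:
        errs = []
        for i in range(k):
            val = folds[i]
            tr = np.concatenate([folds[j] for j in range(k) if j != i])
            w = fit_penalized(X[tr], y[tr], lam)
            errs.append(np.mean((y[val] - X[val] @ w) ** 2))
        scores.append(np.mean(errs))
    return lams[int(np.argmin(scores))]

# Example usage on synthetic data:
# X = np.random.randn(200, 5)
# y = X @ np.array([1.0, 0.0, 2.0, 0.0, -1.0]) + 0.1 * np.random.randn(200)
# best_lam = cross_validate(X, y, lams=[0.0, 0.01, 0.1, 1.0])
```

The squared penalty simply adds lam * w to each weight's gradient, shrinking weights toward zero; the cross-validation loop then picks the penalty factor that generalizes best on held-out folds, which is the role cross-validation plays in the article's final experiment.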