Determining regularization parameters for derivative free neural learning

  • Authors:
  • Ranadhir Ghosh;Moumita Ghosh;John Yearwood;Adil Bagirov

  • Affiliations:
  • School of Information Technology and Mathematical Sciences, University of Ballarat, Ballarat, Australia (all four authors)

  • Venue:
  • MLDM'05: Proceedings of the 4th International Conference on Machine Learning and Data Mining in Pattern Recognition
  • Year:
  • 2005

Abstract

Derivative-free optimization methods have recently gained considerable attention for neural learning. The curse of dimensionality in the neural learning problem makes local optimization methods very attractive; however, the error surface contains many local minima. The discrete gradient method is a special case of derivative-free methods based on bundle methods, and it has the ability to jump over many local minima. Two types of problems arise when local optimization methods are used for neural learning. The first is the initial-sensitivity problem (dependence on the starting point), which is commonly addressed with a hybrid model. Our earlier research has shown that combining the discrete gradient method with global methods such as evolutionary algorithms makes it even more attractive; hybrid models of this kind have also been studied by other researchers. A less frequently discussed problem is that of large weight values for the synaptic connections of the network. Large synaptic weights often lead to network paralysis and convergence problems, especially when a hybrid model is used to fine-tune the learning task. In this paper we study and analyse the effect of different regularization parameters in our objective function that restrict the weight values without compromising classification accuracy.
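
To make the idea concrete, the sketch below shows the standard pattern of adding a weight penalty to the training error. It is a minimal illustration, not the paper's exact formulation: it assumes a single-hidden-layer network, a mean-squared-error data term, and an L2 penalty weighted by a hypothetical regularization parameter `lam`; the paper's actual penalty terms and its derivative-free (discrete gradient) optimizer are not reproduced here.

```python
import numpy as np

def forward(X, W1, b1, W2, b2):
    # Single-hidden-layer network with tanh activation (illustrative choice).
    h = np.tanh(X @ W1 + b1)
    return h @ W2 + b2

def regularized_objective(params, X, y, lam):
    # Data-fit term (mean squared error) plus an L2 weight penalty.
    # lam is a hypothetical regularization parameter: larger values push
    # the synaptic weights toward zero, guarding against the large-weight
    # paralysis problem described in the abstract.
    W1, b1, W2, b2 = params
    pred = forward(X, W1, b1, W2, b2)
    mse = np.mean((pred - y) ** 2)
    penalty = np.sum(W1 ** 2) + np.sum(W2 ** 2)  # biases left unpenalized
    return mse + lam * penalty

# Tiny usage example with random data, sweeping the penalty weight.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
y = rng.normal(size=(20, 1))
params = (rng.normal(size=(3, 5)), np.zeros(5),
          rng.normal(size=(5, 1)), np.zeros(1))
for lam in (0.0, 0.01, 0.1):
    print(lam, regularized_objective(params, X, y, lam))
```

Varying `lam` trades off the data-fit term against the size of the weights; choosing it well is precisely the question of determining the regularization parameter that the paper studies.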