Determining regularization parameters for derivative free neural learning

  • Authors:
  • Ranadhir Ghosh;Moumita Ghosh;John Yearwood;Adil Bagirov

  • Affiliations:
  • School of Information Technology and Mathematical Sciences, University of Ballarat, Ballarat, Australia (all four authors)

  • Venue:
  • MLDM'05: Proceedings of the 4th International Conference on Machine Learning and Data Mining in Pattern Recognition
  • Year:
  • 2005

Abstract

Derivative-free optimization methods have recently gained considerable attention for neural learning. The curse of dimensionality in the neural learning problem makes local optimization methods very attractive; however, the error surface contains many local minima. The discrete gradient method is a special case of derivative-free methods based on bundle methods, and it has the ability to jump over many local minima. Two types of problems arise when local optimization methods are used for neural learning. The first is the initial-sensitivity problem (dependence on the starting point), which is commonly addressed with a hybrid model. Our earlier research has shown that combining the discrete gradient method with global methods such as evolutionary algorithms makes it even more attractive; hybrid models of this kind have also been studied by other researchers. A less frequently discussed problem is that of large weight values for the synaptic connections of the network. Large synaptic weights often lead to network paralysis and convergence problems, especially when a hybrid model is used to fine-tune the learning task. In this paper we study and analyse the effect of different regularization parameters in our objective function that restrict the weight values without compromising classification accuracy.
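
To make the idea concrete, the sketch below shows the standard pattern of adding a weight penalty to the training error. It is a minimal illustration, not the paper's exact formulation: it assumes a single-hidden-layer network, a mean-squared-error data term, and an L2 penalty weighted by a hypothetical regularization parameter `lam`; the paper's actual penalty terms and its derivative-free (discrete gradient) optimizer are not reproduced here.

```python
import numpy as np

def forward(X, W1, b1, W2, b2):
    # Single-hidden-layer network with tanh activation (illustrative choice).
    h = np.tanh(X @ W1 + b1)
    return h @ W2 + b2

def regularized_objective(params, X, y, lam):
    # Data-fit term (mean squared error) plus an L2 weight penalty.
    # lam is a hypothetical regularization parameter: larger values push
    # the synaptic weights toward zero, guarding against the large-weight
    # paralysis problem described in the abstract.
    W1, b1, W2, b2 = params
    pred = forward(X, W1, b1, W2, b2)
    mse = np.mean((pred - y) ** 2)
    penalty = np.sum(W1 ** 2) + np.sum(W2 ** 2)  # biases left unpenalized
    return mse + lam * penalty

# Tiny usage example with random data, sweeping the penalty weight.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
y = rng.normal(size=(20, 1))
params = (rng.normal(size=(3, 5)), np.zeros(5),
          rng.normal(size=(5, 1)), np.zeros(1))
for lam in (0.0, 0.01, 0.1):
    print(lam, regularized_objective(params, X, y, lam))
```

Varying `lam` trades off the data-fit term against the size of the weights; choosing it well is precisely the question of determining the regularization parameter that the paper studies.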