Enhancing the generalization ability of neural networks through controlling the hidden layers

  • Authors:
  • Weishui Wan;Shingo Mabu;Kaoru Shimada;Kotaro Hirasawa;Jinglu Hu

  • Affiliations:
  • Graduate School of Information, Production and Systems, Waseda University, Hibikino 2-7, Wakamatsu-ku, Kitakyushu, Fukuoka 808-0135, Japan (all authors)

  • Venue:
  • Applied Soft Computing
  • Year:
  • 2009

Abstract

In this paper we propose two new variants of the backpropagation algorithm. Both control the outputs of the nodes in the hidden layers, with the aim of solving the moving target problem and the distributed weights problem. The first algorithm (AlgoRobust) is less sensitive to noise in the data. The second (AlgoGS) uses the Gauss-Schmidt algorithm to determine, in each epoch, which weight should be updated, while the other weights are kept unchanged during that epoch. In this way better generalization can be obtained. Some theoretical explanations are also provided. In addition, simulations compare the proposed algorithms with the Gaussian regularizer and optimal brain damage (OBD). The results confirm that the proposed algorithms perform better than the Gaussian regularizer, and that AlgoRobust performs better than AlgoGS on noisy data. On the other hand, AlgoGS performs better than AlgoRobust on data without noise, and the final structure obtained by the two new algorithms is comparable to that obtained using OBD.