EM-Based Radial Basis Function Training with Partial Information
ICANN '02 Proceedings of the International Conference on Artificial Neural Networks
In this paper, we propose a new Expectation-Maximization (EM) algorithm that speeds up the training of feedforward networks with local activation functions, such as the Radial Basis Function (RBF) network. The core of the conventional EM algorithm for supervised learning of feedforward networks consists of decomposing the observations into their individual output units and then estimating the parameters of each unit separately. In previously proposed approaches, at each E-step the residual is decomposed either equally among the units or proportionally to the weights of the output layer; however, this tends to slow down the training of networks with local activation units. To overcome this drawback, we use a new E-step that applies a soft decomposition of the residual among the units. In particular, the residual is decomposed according to the probability of each RBF unit given each input-output pattern. It is shown that this variant not only speeds up training in comparison with other EM-type algorithms, but also yields better results than a global gradient-descent technique, since it can avoid some unwanted minima of the cost function.
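As a rough illustration of the soft E-step described above, the following Python sketch trains the output weights of a Gaussian RBF network by splitting the global residual among the units in proportion to their normalized activations, used here as a simplified stand-in for the posterior probability of each unit given the input-output pattern. The function names, the fixed centers and widths, and the per-unit least-squares M-step are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def rbf_activations(X, centers, widths):
    # Gaussian RBF activations phi_j(x) for every (pattern, unit) pair.
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)  # (N, M)
    return np.exp(-d2 / (2.0 * widths[None, :] ** 2))

def em_rbf_train(X, y, centers, widths, w, n_iter=50):
    """EM-style training of RBF output weights with a soft residual
    decomposition (a sketch of the idea in the abstract; the exact
    posterior model used by the paper is not reproduced here)."""
    for _ in range(n_iter):
        Phi = rbf_activations(X, centers, widths)   # (N, M)
        resid = y - Phi @ w                         # global residual

        # E-step: soft decomposition -- each unit receives a share of the
        # residual given by its normalized activation (assumed stand-in
        # for P(unit j | x_n, y_n)).
        H = Phi / (Phi.sum(axis=1, keepdims=True) + 1e-12)

        # M-step: each unit independently fits its own target, namely its
        # current contribution plus its share of the residual.
        for j in range(Phi.shape[1]):
            target_j = Phi[:, j] * w[j] + H[:, j] * resid
            denom = Phi[:, j] @ Phi[:, j]
            if denom > 1e-12:
                w[j] = (Phi[:, j] @ target_j) / denom
    return w

# Toy usage: fit a 1-D sine with 10 fixed Gaussian units.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, (200, 1))
y = np.sin(3.0 * X[:, 0])
centers = rng.uniform(-1.0, 1.0, (10, 1))
w = em_rbf_train(X, y, centers, np.full(10, 0.3), np.zeros(10))
```

Because the shares in H concentrate on the few units that are active for a given pattern, each unit's target is dominated by residuals from its own local region, which is the mechanism the abstract credits for faster training than an equal or weight-proportional split.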