Using the EM algorithm to train neural networks: misconceptions and a new algorithm for multiclass classification

Authors:
Shu-Kay Ng;G. J. McLachlan
Affiliations:
Dept. of Math., Univ. of Queensland, Brisbane, Qld., Australia;-
Venue:
IEEE Transactions on Neural Networks
Year:
2004

Citing 0
Cited 9

Probabilistic based recursive model for adaptive processing of data structures

Expert Systems with Applications: An International Journal
Applying effective feature selection techniques with hierarchical mixtures of experts for spam classification

Journal of Computer Security
Location management scheme with WLAN positioning algorithm for integrated wireless networks

Computer Communications
Applying effective feature selection techniques with hierarchical mixtures of experts for spam classification

Journal of Computer Security - Best papers of the Sec Track at the 2006 ACM Symposium
A Single Loop EM Algorithm for the Mixture of Experts Architecture

ISNN 2009 Proceedings of the 6th International Symposium on Neural Networks: Advances in Neural Networks - Part II
An incremental EM-based learning approach for on-line prediction of hospital resource utilization

Artificial Intelligence in Medicine
Asymptotic convergence properties of the EM algorithm for mixture of experts

Neural Computation
Normalized Gaussian networks with mixed feature data

AI'05 Proceedings of the 18th Australian Joint conference on Advances in Artificial Intelligence
Mixtures of regressions with changepoints

Statistics and Computing

Abstract

The expectation-maximization (EM) algorithm has been of considerable interest in recent years as the basis for various algorithms in application areas of neural networks such as pattern recognition. However, there exist some misconceptions concerning its application to neural networks. In this paper, we clarify these misconceptions and consider how the EM algorithm can be adopted to train multilayer perceptron (MLP) and mixture of experts (ME) networks in applications to multiclass classification. We identify some situations where the application of the EM algorithm to train MLP networks may be of limited value and discuss some ways of handling the difficulties. For ME networks, it is reported in the literature that networks trained by the EM algorithm using the iteratively reweighted least squares (IRLS) algorithm in the inner loop of the M-step often performed poorly in multiclass classification. However, we found that the convergence of the IRLS algorithm is stable and that the log likelihood is monotonically increasing when a learning rate smaller than one is adopted. Also, we propose the use of an expectation-conditional maximization (ECM) algorithm to train ME networks. Its performance is demonstrated to be superior to the IRLS algorithm on some simulated and real data sets.
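To make the remark about an IRLS learning rate smaller than one more concrete, the following is a minimal sketch of a damped IRLS (Newton) update. It is not the authors' implementation: it assumes a binary logistic model with one input and an intercept rather than the multiclass ME setting of the paper, and the function names (`log_likelihood`, `damped_irls_step`, `fit`), the toy data, and the rate of 0.5 are all hypothetical choices made for illustration.

```python
import math

# Toy, non-separable data (hypothetical, for illustration only).
XS = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
YS = [0, 0, 1, 0, 0, 1, 0, 1, 1, 1]


def log_likelihood(w, xs, ys):
    """Bernoulli log likelihood of a logistic model with intercept w[0], slope w[1]."""
    total = 0.0
    for x, y in zip(xs, ys):
        z = w[0] + w[1] * x
        # log(1 + exp(z)) written in a numerically stable form.
        total += y * z - (max(z, 0.0) + math.log1p(math.exp(-abs(z))))
    return total


def damped_irls_step(w, xs, ys, rate):
    """One IRLS/Newton step scaled by a learning rate (rate = 1 is plain IRLS)."""
    g0 = g1 = 0.0            # gradient of the log likelihood
    h00 = h01 = h11 = 0.0    # X^T V X, the negative Hessian
    for x, y in zip(xs, ys):
        p = 1.0 / (1.0 + math.exp(-(w[0] + w[1] * x)))
        r = y - p            # residual
        v = p * (1.0 - p)    # IRLS weight
        g0 += r
        g1 += r * x
        h00 += v
        h01 += v * x
        h11 += v * x * x
    det = h00 * h11 - h01 * h01
    # Solve the 2x2 system (X^T V X) d = g in closed form.
    d0 = (h11 * g0 - h01 * g1) / det
    d1 = (h00 * g1 - h01 * g0) / det
    return [w[0] + rate * d0, w[1] + rate * d1]


def fit(xs, ys, rate=0.5, iters=50):
    """Run damped IRLS from zero and record the log likelihood after each step."""
    w = [0.0, 0.0]
    history = [log_likelihood(w, xs, ys)]
    for _ in range(iters):
        w = damped_irls_step(w, xs, ys, rate)
        history.append(log_likelihood(w, xs, ys))
    return w, history


if __name__ == "__main__":
    w, history = fit(XS, YS)
    print("weights:", w)
    print("log likelihood, first and last:", history[0], history[-1])
```

Printing `history` for different values of `rate` is one way to inspect how the log likelihood evolves across iterations. In an ME network the analogous update would be applied to the gating and expert parameters inside the M-step, with each observation weighted by its posterior probability from the E-step; that weighting is omitted here for brevity.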