Regional Information Center for Science and Technology (RICeST) - Using the EM algorithm to train neural networks: misconceptions and a new algorithm for multiclass classification

DocumentCode :

980549

Title :

Using the EM algorithm to train neural networks: misconceptions and a new algorithm for multiclass classification

Author :

Ng, Shu-Kay ; McLachlan, Geoffrey John

Author_Institution :

Dept. of Math., Univ. of Queensland, Brisbane, Qld., Australia

Volume :

Issue :

Year :

2004

Date :

5/1/2004

Firstpage :

738

Lastpage :

749

Abstract :

The expectation-maximization (EM) algorithm has been of considerable interest in recent years as the basis for various algorithms in application areas of neural networks such as pattern recognition. However, there exist some misconceptions concerning its application to neural networks. In this paper, we clarify these misconceptions and consider how the EM algorithm can be adopted to train multilayer perceptron (MLP) and mixture of experts (ME) networks in applications to multiclass classification. We identify some situations where the application of the EM algorithm to train MLP networks may be of limited value and discuss some ways of handling the difficulties. For ME networks, it is reported in the literature that networks trained by the EM algorithm, using the iteratively reweighted least squares (IRLS) algorithm in the inner loop of the M-step, often performed poorly in multiclass classification. However, we found that the convergence of the IRLS algorithm is stable and that the log likelihood is monotonically increasing when a learning rate smaller than one is adopted. Also, we propose the use of an expectation-conditional maximization (ECM) algorithm to train ME networks. Its performance is demonstrated to be superior to the IRLS algorithm on some simulated and real data sets.
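The idea of running IRLS with a learning rate smaller than one can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a much simpler setting (a single two-class logistic model with one input and soft targets, standing in for one gating update inside an M-step), and every name in it (`damped_irls_step`, `fit`, the toy data) is hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def log_likelihood(w, xs, ts):
    """Bernoulli log likelihood with soft targets ts in (0, 1)."""
    total = 0.0
    for x, t in zip(xs, ts):
        p = sigmoid(w[0] + w[1] * x)
        total += t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total


def damped_irls_step(w, xs, ts, learning_rate):
    """One IRLS (Newton) step, scaled by learning_rate (assumed < 1)."""
    g0 = g1 = 0.0           # gradient of the log likelihood
    h00 = h01 = h11 = 0.0   # entries of X^T R X with R = diag(p * (1 - p))
    for x, t in zip(xs, ts):
        p = sigmoid(w[0] + w[1] * x)
        r = p * (1.0 - p)
        g0 += t - p
        g1 += (t - p) * x
        h00 += r
        h01 += r * x
        h11 += r * x * x
    # Solve the 2x2 system (X^T R X) d = g in closed form.
    det = h00 * h11 - h01 * h01
    d0 = (h11 * g0 - h01 * g1) / det
    d1 = (h00 * g1 - h01 * g0) / det
    return [w[0] + learning_rate * d0, w[1] + learning_rate * d1]


def fit(xs, ts, learning_rate=0.5, n_iter=25):
    """Run damped IRLS from zero weights and record the log likelihood."""
    w = [0.0, 0.0]
    history = [log_likelihood(w, xs, ts)]
    for _ in range(n_iter):
        w = damped_irls_step(w, xs, ts, learning_rate)
        history.append(log_likelihood(w, xs, ts))
    return w, history


# Toy data (made up): soft targets play the role of posterior probabilities.
xs = [-2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0]
ts = [0.1, 0.2, 0.4, 0.5, 0.7, 0.8, 0.9]
w, history = fit(xs, ts)
```

Inspecting `history` shows how the log likelihood evolves across iterations for a chosen `learning_rate`. The multiclass ME case treated in the paper involves several coupled parameter vectors and is not reproduced here.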

Keywords :

learning (artificial intelligence); least squares approximations; multilayer perceptrons; neural nets; optimisation; variational techniques; MLP networks; expectation-maximization algorithm; expert mixtures; iteratively reweighted least squares; log likelihood; multiclass classification; multilayer perceptron training; neural networks training; variational relaxation; Adaptive systems; Classification algorithms; Electrochemical machining; Iterative algorithms; Jacobian matrices; Least squares approximation; Multilayer perceptrons; Neural networks; Signal processing algorithms; Stochastic processes; Algorithms; Neural Networks (Computer);

Language :

English

Journal_Title :

Neural Networks, IEEE Transactions on

Publisher :

IEEE

ISSN :

1045-9227

Type :

Journal article

DOI :

10.1109/TNN.2004.826217

Filename :

1296699

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=980549