DocumentCode :
3123317
Title :
Training multilayer perceptron by using optimal input normalization
Author :
Cai, Xun ; Tyagi, Kanishka ; Manry, Michael T.
Author_Institution :
Sch. of Comput. Sci. & Technol., Shandong Univ., Jinan, China
fYear :
2011
fDate :
27-30 June 2011
Firstpage :
2771
Lastpage :
2778
Abstract :
In this paper, we propose a novel second-order training paradigm called optimal input normalization (OIN) to address the slow convergence and high complexity of multilayer perceptron (MLP) training. By optimizing a non-orthogonal transformation matrix applied to the input units of an equivalent network, OIN absorbs a separate optimal learning factor for each synaptic weight and each hidden-unit threshold, improving the performance of MLP training. Moreover, by applying a whitening transformation to the negative Jacobian matrix of the hidden weights, a modified version of OIN called optimal input normalization with hidden weights optimization (OIN-HWO) is also proposed. The Hessian matrices in both OIN and OIN-HWO are computed using the Gauss-Newton method, and all linear equations are solved via orthogonal least squares (OLS). Regression simulations on several real-life datasets show that OIN achieves not only a much better convergence rate and generalization ability than output weights optimization-back propagation (OWO-BP), optimal input gains (OIG), and even the Levenberg-Marquardt (LM) method, but also requires less computational time than OWO-BP. Although OIN-HWO incurs a slightly higher computational burden than OIN, it converges faster than OIN and often approaches or rivals LM. OIN-based algorithms are therefore promising candidates for practical applications.
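To make the idea concrete, below is a minimal, hypothetical sketch of the general scheme the abstract describes: one Gauss-Newton update of an input transformation matrix A in a single-hidden-layer MLP, with the Hessian and gradient formed as J^T J and J^T e. The function names (mlp_forward, oin_step), the network shapes, the omission of thresholds/biases, and the use of numpy.linalg.lstsq in place of the paper's orthogonal least squares (Schmidt) solver are all illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch only (assumptions noted above), not the paper's code.
import numpy as np

rng = np.random.default_rng(0)

def mlp_forward(X, A, W, Wo):
    """Forward pass of the equivalent network: hidden net = W @ (A @ x)."""
    Z = X @ A.T                # transformed (normalized) inputs
    H = np.tanh(Z @ W.T)       # hidden activations
    return H @ Wo.T, Z, H

def oin_step(X, Y, A, W, Wo):
    """One Gauss-Newton update of A; the Hessian J^T J and gradient J^T e
    follow from a first-order Taylor expansion of the output error."""
    Yhat, Z, H = mlp_forward(X, A, W, Wo)
    E = Y - Yhat                                   # (N, M) residuals
    dH = 1.0 - H**2                                # tanh'(net)
    N, n_in = X.shape
    # Jacobian of the outputs w.r.t. the n_in * n_in entries of A.
    J = np.zeros((N * Y.shape[1], n_in * n_in))
    for p in range(n_in):                          # row index of A
        for q in range(n_in):                      # column index of A
            dZ = np.zeros_like(Z)
            dZ[:, p] = X[:, q]                     # dZ/dA[p, q]
            dY = (dH * (dZ @ W.T)) @ Wo.T          # chain rule through tanh
            J[:, p * n_in + q] = dY.ravel()
    # Solving min ||E - J d|| by lstsq is equivalent (in exact arithmetic)
    # to solving the normal equations (J^T J) d = J^T e; lstsq stands in
    # here for the orthogonal least squares solve used in the paper.
    d, *_ = np.linalg.lstsq(J, E.ravel(), rcond=None)
    return A + d.reshape(n_in, n_in)

# Tiny usage example on synthetic regression data.
X = rng.normal(size=(200, 4))
Y = np.sin(X[:, :1]) + 0.1 * rng.normal(size=(200, 1))
A = np.eye(4)                                      # initial input transformation
W = rng.normal(scale=0.5, size=(6, 4))
Wo = rng.normal(scale=0.5, size=(1, 6))
for _ in range(5):
    A = oin_step(X, Y, A, W, Wo)
```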
Keywords :
Hessian matrices; Jacobian matrices; Newton method; generalisation (artificial intelligence); learning (artificial intelligence); least squares approximations; multilayer perceptrons; regression analysis; Gauss-Newton method; Hessian matrix; Levenberg-Marquardt method; MLP training; convergence rate; generalization ability; hidden weights optimization; linear equation; multilayer perceptron training; negative Jacobian matrix; nonorthogonal transformation matrix; optimal input gain; optimal input normalization; orthogonal least square; output weights optimization-back propagation; regression simulation; Convergence; Equations; Jacobian matrices; Mathematical model; Optimization; Training; Vectors; Gauss-Newton; Schmidt procedure; hidden weights optimization (HWO); multilayer perceptron (MLP); optimal learning factor (OLF); orthogonal least square (OLS); output weights optimization back propagation (OWO-BP);
fLanguage :
English
Publisher :
IEEE
Conference_Title :
2011 IEEE International Conference on Fuzzy Systems (FUZZ)
Conference_Location :
Taipei
ISSN :
1098-7584
Print_ISBN :
978-1-4244-7315-1
Type :
conf
DOI :
10.1109/FUZZY.2011.6007648
Filename :
6007648