Title :
Training multilayer perceptron by using optimal input normalization
Author :
Cai, Xun ; Tyagi, Kanishka ; Manry, Michael T.
Author_Institution :
Sch. of Comput. Sci. & Technol., Shandong Univ., Jinan, China
Abstract :
In this paper, we propose a novel second-order training paradigm called optimal input normalization (OIN) to address the slow convergence and high computational complexity of multilayer perceptron (MLP) training. By optimizing a non-orthogonal transformation matrix applied to the input units of an equivalent network, OIN absorbs a separate optimal learning factor for each synaptic weight as well as for each hidden-unit threshold, improving the performance of MLP training. Moreover, by applying a whitening transformation to the negative Jacobian matrix of the hidden weights, a modified version of OIN called optimal input normalization with hidden weights optimization (OIN-HWO) is also proposed. The Hessian matrices in both OIN and OIN-HWO are computed using the Gauss-Newton method, and all linear equations are solved via orthogonal least squares (OLS). Regression simulations on several real-life datasets show that the proposed OIN not only achieves a much better convergence rate and generalization ability than output weights optimization-backpropagation (OWO-BP), optimal input gains (OIG), and even the Levenberg-Marquardt (LM) method, but also requires less computation time than OWO-BP. Although OIN-HWO is somewhat more computationally expensive than OIN, it converges faster than OIN and often approaches or rivals LM. OIN-based algorithms are therefore promising choices for practical applications.
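Illustrative sketch (not from the paper): the snippet below is a minimal, hypothetical illustration of the idea summarized in the abstract, namely a one-hidden-layer MLP in which a linear transformation matrix A on the augmented input units is optimized while the output weights are re-solved by linear least squares (an OWO-style step). For brevity it updates A with a plain gradient step rather than the Gauss-Newton/Newton update and OLS solver described by the authors; all function names, parameters, and data are assumptions made for illustration only.

    import numpy as np

    def train_oin_sketch(X, y, n_hidden=10, n_iters=50, lr=1e-2, seed=0):
        """Toy illustration: optimize an input transformation A for a fixed
        hidden-weight matrix W, re-solving output weights by least squares."""
        rng = np.random.default_rng(seed)
        N, n_in = X.shape
        y = y.reshape(N, -1)                              # targets as N x M
        Xa = np.hstack([X, np.ones((N, 1))])              # augment inputs with a bias column
        W = rng.standard_normal((n_hidden, n_in + 1))     # fixed (random) hidden weights
        A = np.eye(n_in + 1)                              # input transformation to optimize
        for _ in range(n_iters):
            H = np.tanh(Xa @ (W @ A).T)                   # hidden outputs of the equivalent network
            Hb = np.hstack([H, np.ones((N, 1))])          # hidden basis plus output bias
            Wo, *_ = np.linalg.lstsq(Hb, y, rcond=None)   # OWO step: output weights by least squares
            err = Hb @ Wo - y                             # output residuals
            delta = (err @ Wo[:-1].T) * (1.0 - H**2)      # back-propagated hidden deltas (tanh')
            G = delta.T @ Xa                              # gradient w.r.t. effective weights W A
            A -= lr * (W.T @ G) / N                       # chain rule: dE/dA = W^T dE/d(WA)
        return A, W, Wo

    # Illustrative usage on a synthetic regression problem
    X = np.random.randn(200, 4)
    y = np.sin(X[:, 0]) + 0.5 * X[:, 1]
    A, W, Wo = train_oin_sketch(X, y)

The sketch only shows why optimizing a single transformation matrix A is equivalent to rescaling all input weights and hidden thresholds at once; the paper's reported gains come from the second-order (Gauss-Newton) step on A and the OLS solver, which are not reproduced here.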
Keywords :
Hessian matrices; Jacobian matrices; Newton method; generalisation (artificial intelligence); learning (artificial intelligence); least squares approximations; multilayer perceptrons; regression analysis; Gauss-Newton method; Hessian matrix; Levenberg-Marquardt method; MLP training; convergence rate; generalization ability; hidden weights optimization; linear equation; multilayer perceptron training; negative Jacobian matrix; nonorthogonal transformation matrix; optimal input gain; optimal input normalization; orthogonal least square; output weights optimization-back propagation; regression simulation; Convergence; Equations; Jacobian matrices; Mathematical model; Optimization; Training; Vectors; Gauss-Newton; Schmidt procedure; hidden weights optimization (HWO); multilayer perceptron (MLP); optimal learning factor (OLF); orthogonal least square (OLS); output weights optimization back propagation (OWO-BP);
Conference_Titel :
Fuzzy Systems (FUZZ), 2011 IEEE International Conference on
Conference_Location :
Taipei
Print_ISBN :
978-1-4244-7315-1
Electronic_ISBN :
1098-7584
DOI :
10.1109/FUZZY.2011.6007648