Title :
Objective Functions of Online Weight Noise Injection Training Algorithms for MLPs
Author :
Ho, Kevin ; Leung, Chi-Sing ; Sum, John
Author_Institution :
Dept. of Comput. Sci. & Commun. Eng., Providence Univ., Taichung, Taiwan
Abstract :
Injecting weight noise during training has been a simple strategy for improving the fault tolerance of multilayer perceptrons (MLPs) for almost two decades, and several online training algorithms have been proposed for this purpose. However, there are some misconceptions about the objective functions that these algorithms minimize. Some existing results mistakenly assume that the objective function of a weight noise injection algorithm is equivalent to the prediction error of a trained MLP affected by weight noise. In this brief, we clarify these misconceptions. Two weight noise injection scenarios are considered: one based on additive weight noise injection and the other on multiplicative weight noise injection. To avoid the misconceptions, we analyze the objective functions through the mean updating equations of the algorithms. For additive weight noise injection during training, we show that the true objective function is identical to the prediction error of a faulty MLP whose weights are affected by additive weight noise; it consists of the conventional mean square error and a smoothing regularizer. For multiplicative weight noise injection during training, we show that the objective function differs from the prediction error of a faulty MLP whose weights are affected by multiplicative weight noise. These results resolve some existing misconceptions regarding MLP training with weight noise injection.
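The two injection scenarios named in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: it assumes a single-hidden-layer MLP with tanh units and a linear output, zero-mean Gaussian noise, and the commonly used online form in which the gradient is evaluated at the noise-perturbed weights while the update is applied to the clean weights. The names `mlp_forward`, `perturb`, `noisy_online_step`, `lr`, and `sigma` are hypothetical.

```python
import numpy as np

def mlp_forward(x, W, v):
    """Single-hidden-layer MLP (assumed architecture): tanh hidden units, linear output."""
    h = np.tanh(W @ x)
    return v @ h, h

def perturb(theta, sigma, mode, rng):
    """Return a noisy copy of theta; the clean weights are left untouched."""
    b = rng.normal(0.0, sigma, size=theta.shape)  # zero-mean Gaussian noise (assumption)
    if mode == "additive":
        return theta + b            # noise independent of the weight value
    if mode == "multiplicative":
        return theta + b * theta    # noise scaled by the weight value
    raise ValueError(f"unknown mode: {mode}")

def noisy_online_step(x, y, W, v, lr, sigma, mode, rng):
    """One online step: gradient taken at the noisy weights, applied to the clean ones."""
    W_n = perturb(W, sigma, mode, rng)
    v_n = perturb(v, sigma, mode, rng)
    out, h = mlp_forward(x, W_n, v_n)
    err = out - y                                   # derivative of 0.5 * err**2 w.r.t. out
    grad_v = err * h
    grad_W = err * np.outer(v_n * (1.0 - h**2), x)  # backprop through tanh
    return W - lr * grad_W, v - lr * grad_v
```

With `sigma = 0` both modes reduce to plain online gradient descent on the squared error. The brief's point is that the function these noisy updates minimize on average need not coincide with the prediction error of the trained network under the same kind of weight noise, which is why the multiplicative case is analyzed separately.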
Keywords :
fault tolerance; mean square error methods; multilayer perceptrons; MLP; additive weight noise injection; mean square error; multiplicative weight noise injection; objective functions; online weight noise injection training algorithms; smoothing regularizer; Additives; Algorithm design and analysis; Equations; Noise; Prediction algorithms; Training; prediction error; weight noise injection; Algorithms; Artifacts; Artificial Intelligence; Neural Networks (Computer); Nonlinear Dynamics; Pattern Recognition, Automated; Software Design; Teaching
Journal_Title :
Neural Networks, IEEE Transactions on
DOI :
10.1109/TNN.2010.2095881