Convergence Analysis of Multiplicative Weight Noise Injection During Training

Author

Ho, Kevin ; Leung, Chi-Sing ; Sum, John ; Lau, Siu-chung

Author_Institution

Dept. of Comput. Sci. & Commun. Eng., Providence Univ., Sha-Lu, Taiwan

fYear

2010

fDate

18-20 Nov. 2010

Firstpage

358

Lastpage

365

Abstract

Injecting weight noise during training was proposed almost two decades ago as a simple technique to improve the fault tolerance and generalization of a multilayer perceptron (MLP). However, little has been done to analyze the convergence behavior of such algorithms. This paper therefore presents convergence proofs for two of these algorithms for MLPs. One combines multiplicative weight noise injection with weight decay (MWN-WD) during training. The other combines additive weight noise injection with weight decay (AWN-WD) during training. Let m be the number of hidden nodes of an MLP, a be the weight decay constant, and S_b be the noise variance. It is shown that the MWN-WD algorithm converges with probability one if a > √(S_b)m, while the AWN-WD algorithm converges with probability one if a > 0.
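
The abstract names the two training schemes but does not give their update equations. The following is only a minimal sketch of what one online step of each might look like, assuming a single-hidden-layer MLP with tanh hidden nodes, a linear output, squared error, and zero-mean Gaussian noise of variance S_b. The function names, the step size `lr`, and the choice of activation are hypothetical and are not taken from the paper.

```python
import math
import random


def forward(x, W, v):
    """Single-hidden-layer MLP: tanh hidden nodes, linear output (assumed)."""
    h = [math.tanh(sum(w * xj for w, xj in zip(row, x))) for row in W]
    return sum(vi * hi for vi, hi in zip(v, h)), h


def gradients(x, y, W, v):
    """Gradient of 0.5 * (f(x) - y)**2 with respect to W and v."""
    out, h = forward(x, W, v)
    err = out - y
    grad_v = [err * hi for hi in h]
    grad_W = [[err * vi * (1.0 - hi ** 2) * xj for xj in x]
              for vi, hi in zip(v, h)]
    return grad_W, grad_v


def perturb(w, s_b, rng, multiplicative):
    """Corrupt one weight with zero-mean Gaussian noise of variance s_b."""
    b = rng.gauss(0.0, math.sqrt(s_b))
    return w * (1.0 + b) if multiplicative else w + b


def noisy_step(x, y, W, v, lr, a, s_b, rng, multiplicative=True):
    """One online update (assumed form): evaluate the gradient at the
    noise-corrupted weights, then update the clean weights with decay a."""
    W_t = [[perturb(w, s_b, rng, multiplicative) for w in row] for row in W]
    v_t = [perturb(w, s_b, rng, multiplicative) for w in v]
    gW, gv = gradients(x, y, W_t, v_t)
    W_new = [[w - lr * (g + a * w) for w, g in zip(row, grow)]
             for row, grow in zip(W, gW)]
    v_new = [w - lr * (g + a * w) for w, g in zip(v, gv)]
    return W_new, v_new


# Usage: m = 3 hidden nodes, 2 inputs, one training sample.
rng = random.Random(0)
W = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(3)]
v = [rng.uniform(-1, 1) for _ in range(3)]
W, v = noisy_step([0.5, -1.0], 0.3, W, v, lr=0.1, a=0.05, s_b=0.01, rng=rng)
```

Passing `multiplicative=False` switches the same step to the additive variant. In the multiplicative case the perturbation scales with the weight itself, which may give some intuition for why the stated MWN-WD condition ties the decay constant to the noise variance while the AWN-WD condition only requires a positive decay.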

Keywords

fault tolerance; learning (artificial intelligence); multilayer perceptrons; probability; AWN-WD algorithm; MWN-WD algorithm; additive weight noise; convergence analysis; convergence behavior; convergence proof; multilayer perceptron; multiplicative weight noise injection; noise variance; training; weight decay; MLP; convergence; learning; weight noise

fLanguage

English

Publisher

ieee

Conference_Titel

Technologies and Applications of Artificial Intelligence (TAAI), 2010 International Conference on

Conference_Location

Hsinchu City

Print_ISBN

978-1-4244-8668-7

Electronic_ISBN

978-0-7695-4253-9

Type

conf

DOI

10.1109/TAAI.2010.64

Filename

5695477