DocumentCode
288378
Title
Scaling properties of on-line learning with momentum
Author
Heskes, Tom ; Wiegerinck, Wim ; Komoda, Andrzej
Author_Institution
Beckman Inst. for Adv. Sci. & Technol., Illinois Univ., Urbana, IL, USA
Volume
1
fYear
1994
fDate
27 Jun-2 Jul 1994
Firstpage
508
Abstract
We study on-line learning with a momentum term for nonlinear learning rules. By introducing auxiliary variables, we show that the learning process can still be described by a first-order Markov process. For small learning parameters η and momentum parameters α close to 1 (we consider the case α=1-√(η/λ) for small η), Van Kampen's expansion can be applied in a straightforward manner. We obtain evolution equations for the average network state and for the fluctuations around this average. After rescaling of time and fluctuations, these evolution equations depend only on λ=η/(1-α)²: all combinations (η,α) with the same value of λ give rise to similar graphs. For small λ, i.e., η≪(1-α)², learning with a momentum term is equivalent to learning without a momentum term with rescaled learning parameter η̃=η/(1-α). Simulations with the nonlinear Oja learning rule confirm our theoretical results.
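The scaling relations quoted in the abstract can be illustrated with a minimal Python sketch. All function names here are hypothetical, and the scalar rule f(w, x) = x - w is an assumption chosen for simplicity; it is not the nonlinear Oja rule used in the paper's simulations. The sketch only shows how λ, α and the rescaled learning parameter relate, and how storing the previous update as an auxiliary variable makes the pair (w, dw) a first-order Markov state.

```python
import math


def scaling_lambda(eta, alpha):
    """Return lambda = eta / (1 - alpha)^2, the combination named in the abstract."""
    return eta / (1.0 - alpha) ** 2


def alpha_from_lambda(eta, lam):
    """Return alpha = 1 - sqrt(eta / lambda), the parametrisation in the abstract."""
    return 1.0 - math.sqrt(eta / lam)


def effective_rate(eta, alpha):
    """Return the rescaled learning parameter eta / (1 - alpha) (small-lambda regime)."""
    return eta / (1.0 - alpha)


def run_momentum(eta, alpha, inputs, w0=0.0):
    """On-line learning with momentum for an illustrative scalar rule.

    The auxiliary variable dw stores the previous update, so the next
    state (w, dw) depends only on the current state and the current input.
    The rule f(w, x) = x - w is an assumption made for this sketch.
    """
    w, dw = w0, 0.0
    for x in inputs:
        dw = eta * (x - w) + alpha * dw  # new update = gradient step + momentum
        w += dw
    return w


def run_plain(eta, inputs, w0=0.0):
    """The same illustrative rule without a momentum term."""
    w = w0
    for x in inputs:
        w += eta * (x - w)
    return w
```

With η = 1e-4 and α = 0.9, so that λ = 0.01 is small, `run_momentum(1e-4, 0.9, inputs)` and `run_plain(effective_rate(1e-4, 0.9), inputs)` should stay close to each other on the same input sequence, which is the equivalence the abstract describes for η≪(1-α)².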
Keywords
Markov processes; learning (artificial intelligence); neural nets; real-time systems; Markov process; Oja learning rule; evolution equations; momentum parameter; nonlinear learning rules; online learning; scaling properties; Backpropagation algorithms; Biophysics; Differential equations; Fluctuations; Least squares approximation; Nonlinear equations; Physics; Stochastic processes; Time measurement
fLanguage
English
Publisher
IEEE
Conference_Titel
Neural Networks, 1994. IEEE World Congress on Computational Intelligence., 1994 IEEE International Conference on
Conference_Location
Orlando, FL
Print_ISBN
0-7803-1901-X
Type
conf
DOI
10.1109/ICNN.1994.374215
Filename
374215