Title :
On the Weight Convergence of Elman Networks
Author_Institution :
Sch. of Electr. & Electron. Eng., Nanyang Technol. Univ., Singapore, Singapore
Date :
3/1/2010
Abstract :
An Elman network (EN) can be viewed as a feedforward (FF) neural network with an additional set of inputs from the context layer (feedback from the hidden layer). Therefore, instead of the offline backpropagation-through-time (BPTT) algorithm, a standard online (real-time) backpropagation (BP) algorithm, usually called Elman BP (EBP), can be applied to train ENs for discrete-time sequence prediction. However, the standard BP training algorithm is not the most suitable for ENs. A low learning rate can improve EN training but can also result in very slow convergence and poor generalization performance, whereas a high learning rate can lead to unstable training in the sense of weight divergence. Therefore, an optimal or suboptimal tradeoff between training speed and weight convergence with good generalization capability is desired for ENs. This paper develops a robust extended EBP (eEBP) training algorithm for ENs with a new adaptive dead zone scheme based on eEBP training concepts. The adaptive learning rate and adaptive dead zone optimize the training of ENs for each individual output and improve the generalization performance of eEBP training. In particular, for the proposed eEBP training algorithm, convergence of the ENs' weights with the adaptive dead zone estimates is proven in the sense of Lyapunov functions. Computer simulations are carried out to demonstrate the improved performance of eEBP for discrete-time sequence predictions.
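The abstract describes EBP as standard online BP applied to an EN whose context layer is treated as an extra set of inputs, with a dead zone that suppresses updates once the output error is small. The sketch below is a minimal illustration of that idea only; the network size, learning rate, the fixed dead-zone threshold, and the toy sine-prediction task are all assumptions for demonstration, and the paper's actual eEBP uses adaptive learning rates and adaptive dead zone estimates per output, which are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (not from the paper)
n_in, n_hid, n_out = 1, 8, 1

# Weights: input->hidden, context(previous hidden)->hidden, hidden->output
W_in = rng.normal(scale=0.3, size=(n_hid, n_in))
W_ctx = rng.normal(scale=0.3, size=(n_hid, n_hid))
W_out = rng.normal(scale=0.3, size=(n_out, n_hid))


def train_epoch(xs, ds, eta=0.03, dead_zone=1e-3):
    """One online pass of Elman BP with a fixed dead zone (illustrative)."""
    global W_in, W_ctx, W_out
    h_prev = np.zeros(n_hid)  # context layer state
    sse = 0.0
    for x, d in zip(xs, ds):
        # Forward pass: the context layer feeds back the previous hidden state
        h = np.tanh(W_in @ x + W_ctx @ h_prev)
        y = W_out @ h
        e = d - y
        sse += float(e @ e)
        # Dead zone: skip the weight update when the error is already small,
        # trading a little accuracy for weight convergence
        if np.linalg.norm(e) > dead_zone:
            # Standard BP gradients, treating the context as a fixed input
            # (no backpropagation through time)
            g_hid = (W_out.T @ e) * (1.0 - h**2)
            W_out += eta * np.outer(e, h)
            W_in += eta * np.outer(g_hid, x)
            W_ctx += eta * np.outer(g_hid, h_prev)
        h_prev = h
    return sse / len(xs)


# Toy task (an assumption): one-step-ahead prediction of a sine sequence
t = np.arange(200)
seq = np.sin(0.2 * t)
xs = seq[:-1].reshape(-1, 1)
ds = seq[1:].reshape(-1, 1)

mse_first = train_epoch(xs, ds)
for _ in range(30):
    mse_last = train_epoch(xs, ds)
```

The dead-zone test is the key difference from plain EBP: inside the zone the weights freeze, which is what makes a Lyapunov-style convergence argument for the weight sequence possible.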
Keywords :
Lyapunov methods; adaptive systems; backpropagation; convergence; feedforward neural nets; generalisation (artificial intelligence); optimisation; Elman networks (ENs); Lyapunov function; adaptive dead zone; adaptive learning rate; backpropagation through time algorithm; context layer; discrete time sequence prediction; extended Elman backpropagation (eEBP) training; feedforward neural network; generalization; weight convergence; weight divergence; Adaptation, Physiological; Algorithms; Artificial Intelligence; Computer Simulation; Feedback; Generalization (Psychology); Humans; Neural Networks (Computer); Pattern Recognition, Automated; Predictive Value of Tests; Time Factors;
Journal_Title :
IEEE Transactions on Neural Networks
DOI :
10.1109/TNN.2009.2039226