Title :
Training strategies for critic and action neural networks in dual heuristic programming method
Author :
Lendaris, George G. ; Paintz, Christian
Author_Institution :
Portland State Univ., OR, USA
Abstract :
This paper discusses strategies for, and details of, training procedures for the dual heuristic programming (DHP) methodology. DHP and other approximate dynamic programming approaches discussed in the literature are all members of the adaptive critic design family. The paper suggests and investigates several alternative training procedures and compares their performance with respect to convergence speed and quality of the resulting controller design. One modification introduces a real copy of the criticNN (criticNN 2) for making the “desired output” calculations; criticNN 2 is trained differently from criticNN 1. The idea is to provide the “desired outputs” from a stable platform during an epoch while criticNN 1 is adapted. At the end of the epoch, criticNN 2 is made identical to the then-current adapted state of criticNN 1, and a new epoch starts. In this way, both criticNN 1 and the actionNN can be trained simultaneously online during each epoch, with faster overall convergence than the older approach. The measures used suggest that a “better” controller design (the actionNN) results.
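The two-critic schedule described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the scalar plant x' = A*x + B*u, the quadratic utility U = x^2 + R*u^2, the one-weight linear critic and action "networks", the learning rates, and the choice to drive the action update from criticNN 1 are all hypothetical stand-ins chosen to keep the example short. Only the schedule itself (criticNN 2 frozen during an epoch to supply the "desired outputs", then overwritten with criticNN 1 at the end of the epoch) comes from the abstract.

```python
import random

# Hypothetical scalar plant x' = A*x + B*u with utility U = x^2 + R*u^2.
# None of these constants come from the paper.
A, B, R, GAMMA = 0.9, 0.5, 0.1, 0.95


def train(epochs=20, steps_per_epoch=50, lr_critic=0.05, lr_action=0.01, seed=0):
    rng = random.Random(seed)
    w1 = 0.0  # critic 1: lambda(x) ~ w1 * x, adapted at every step
    w2 = 0.0  # critic 2: frozen copy, used only for the "desired outputs"
    k = 0.0   # action: u = -k * x
    sync_log = []  # (w2 at epoch start, w2 at epoch end, w1 at epoch end)

    for _ in range(epochs):
        w2_start = w2
        for _ in range(steps_per_epoch):
            x = rng.uniform(-1.0, 1.0)
            u = -k * x
            x_next = A * x + B * u

            # DHP desired output: total derivative of U + GAMMA*J(x') with
            # respect to x, with lambda(x') supplied by the frozen critic 2.
            target = (2.0 * x + 2.0 * R * u * (-k)
                      + GAMMA * (w2 * x_next) * (A - B * k))

            # Critic 1 gradient step toward the desired output.
            w1 -= lr_critic * (w1 * x - target) * x

            # Action gradient step on dU/du + GAMMA*lambda(x')*dx'/du.
            # Using critic 1 here is an assumption of this sketch.
            dJ_du = 2.0 * R * u + GAMMA * (w1 * x_next) * B
            k -= lr_action * dJ_du * (-x)  # du/dk = -x

        sync_log.append((w2_start, w2, w1))
        w2 = w1  # end of epoch: critic 2 becomes a copy of critic 1

    return w1, w2, k, sync_log
```

Calling `train()` returns the final weights together with a per-epoch log that can be used to check that criticNN 2 stayed fixed within each epoch and picked up the adapted criticNN 1 weights at each epoch boundary.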
Keywords :
convergence of numerical methods; dynamic programming; heuristic programming; learning (artificial intelligence); neurocontrollers; action neural networks; controller design; convergence; critic neural networks; criticNN; dual heuristic programming; training strategies; Analytical models; Backpropagation algorithms; Convergence; Dynamic programming; Equations; Integrated circuit modeling; Intelligent networks; Neural networks; Signal generators; State-space methods;
Conference_Title :
International Conference on Neural Networks, 1997
Conference_Location :
Houston, TX
Print_ISBN :
0-7803-4122-8
DOI :
10.1109/ICNN.1997.616109