DocumentCode :
3565272
Title :
More on training strategies for critic and action neural networks in dual heuristic programming method
Author :
Lendaris, George G. ; Paintz, Christian ; Shannon, Thaddeus
Author_Institution :
Dept. of Syst. Sci. & Electr. Eng., Portland State Univ., OR, USA
Volume :
4
fYear :
1997
Firstpage :
3067
Abstract :
The article describes a modification to the usual procedures for training the critic and action neural networks in the dual heuristic programming (DHP) method (D. Prokhorov and D. Wunsch, 1996; R. Santiago, 1995; P. Werbos, 1994). The modification updates both the critic and the action networks at each computational cycle, rather than only one at a time. It does so by introducing a (real) second copy of the critic network whose weights are adjusted less often; the “desired value” for training the other critic is obtained from this copy. Previously (G. Lendaris and C. Paintz, 1997), the modified training strategy was demonstrated on the pole-cart controller problem: the full 6-dimensional state vector was input to the critic and action NNs, but the utility function involved only the pole angle, not the distance along the track (x). For the first set of results presented here, the 3 states associated with the x variable were eliminated from the inputs to the NNs, keeping the previously defined utility function. This resulted in improved learning and controller performance. The method is then applied to two additional problems of increasing complexity: in the first, an x-related term is added to the utility function for the pole-cart problem and, simultaneously, the x-related states are added back as inputs to the NNs (i.e., the number of state variables used increases from 3 to 6); the second concerns steering a vehicle with independent drive motors on each wheel. The problem contexts and experimental results are provided.
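The training schedule described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: a scalar linear plant stands in for the pole-cart system, the critic and action "networks" are each a single weight, and the names `train_dhp`, `sync_every`, `w`, `w_copy`, and `k` are hypothetical. Using the critic copy in the action update as well is also an assumption. The sketch only shows the idea of updating critic and action in every cycle while drawing the critic's "desired value" from a copy that is refreshed less often.

```python
import random


def train_dhp(steps=4000, sync_every=10, gamma=0.9,
              lr_critic=0.05, lr_action=0.02, seed=0):
    """Toy DHP loop with a slowly updated critic copy (illustrative only)."""
    rng = random.Random(seed)
    a, b = 0.9, 0.5   # assumed plant: x_next = a*x + b*u
    q, r = 1.0, 1.0   # assumed utility: U = 0.5*(q*x**2 + r*u**2)
    w = 0.0           # critic being trained: lambda(x) = w*x estimates dJ/dx
    w_copy = 0.0      # second critic copy, refreshed every sync_every cycles
    k = 0.0           # action "network": u = -k*x
    syncs = 0
    for t in range(1, steps + 1):
        x = rng.uniform(-1.0, 1.0)
        u = -k * x
        x_next = a * x + b * u
        # The next-step derivative estimate comes from the critic COPY.
        lam_next = w_copy * x_next
        # DHP desired value: total derivative of U + gamma*J(x_next) w.r.t. x.
        target = q * x + r * u * (-k) + gamma * lam_next * (a - b * k)
        # Critic update (every cycle).
        w -= lr_critic * (w * x - target) * x
        # Action update (same cycle): descend dJ/du, with du/dk = -x.
        dJ_du = r * u + gamma * lam_next * b
        k -= lr_action * dJ_du * (-x)
        # Less frequent refresh of the critic copy.
        if t % sync_every == 0:
            w_copy = w
            syncs += 1
    return w, w_copy, k, syncs


if __name__ == "__main__":
    print(train_dhp())
```

In this toy setting both parameters change in every cycle, while the target moves only when the copy is refreshed, which is the separation the abstract attributes to the second critic copy.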
Keywords :
dynamic programming; heuristic programming; learning (artificial intelligence); motion control; neurocontrollers; vehicles; DHP; computational cycle; controller performance; critic and action neural networks; critic copy; desired value; dual heuristic programming method; independent drive motors; learning; modified training strategy; pole cart controller problem; pole cart problem; state variables; training strategies; utility function; Analytical models; Computer networks; Design methodology; Equations; Intelligent networks; Neural networks; Signal generators; Sliding mode control; Vehicle driving; Wheels;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Systems, Man, and Cybernetics, 1997. Computational Cybernetics and Simulation., 1997 IEEE International Conference on
ISSN :
1062-922X
Print_ISBN :
0-7803-4053-1
Type :
conf
DOI :
10.1109/ICSMC.1997.633058
Filename :
633058