Regional Information Center for Science and Technology (RICEST) - More on training strategies for critic and action neural networks in dual heuristic programming method

DocumentCode :

3565272

Title :

More on training strategies for critic and action neural networks in dual heuristic programming method

Author :

Lendaris, George G. ; Paintz, Christian ; Shannon, Thaddeus

Author_Institution :

Dept. of Syst. Sci. & Electr. Eng., Portland State Univ., OR, USA

Volume :

fYear :

1997

Firstpage :

3067

Abstract :

This article describes a modification to the usual procedures for training the critic and action neural networks in the dual heuristic programming (DHP) method (D. Prokhorov and D. Wunsch, 1996; R. Santiago, 1995; P. Werbos, 1994). The modification entails updating both the critic and the action networks at each computational cycle, rather than only one at a time. The distinction lies in the introduction of a (real) second copy of the critic network whose weights are adjusted less often; the “desired value” for training the other critic is obtained from this copy. Previously (G. Lendaris and C. Paintz, 1997), the modified training strategy was demonstrated on the pole cart controller problem: the full six-dimensional state vector was input to the critic and action NNs, but the utility function involved only the pole angle, not the distance along the track (x). For the first set of results presented here, the three states associated with the x variable were eliminated from the inputs to the NNs, keeping the previously defined utility function. This resulted in improved learning and controller performance. The method is then applied to two additional problems of increasing complexity. In the first, an x-related term is added to the utility function for the pole cart problem and, simultaneously, the x-related states are added back to the NN inputs (i.e., the number of state variables used increases from three to six). The second concerns steering a vehicle with independent drive motors on each wheel. The problem contexts and experimental results are provided.
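The training schedule described in the abstract (critic and action updated on every cycle, with the critic's "desired value" taken from a second critic copy that is adjusted less often) can be illustrated with a minimal sketch. This is not the authors' implementation: the scalar linear plant, the quadratic utility, the single-weight linear "networks", the learning rates, and the names `train_dhp` and `copy_every` are all assumptions chosen only to keep the example short.

```python
import numpy as np


def train_dhp(steps=2000, copy_every=20, seed=0):
    """Toy DHP loop with a slowly updated critic copy (illustrative only)."""
    rng = np.random.default_rng(seed)

    # Hypothetical scalar plant x_next = a*x + b*u (not from the paper).
    a, b = 0.9, 0.5
    # Hypothetical utility U = x**2 + r*u**2 and discount factor gamma.
    r, gamma = 0.1, 0.95
    lr_critic, lr_action = 0.05, 0.05

    w = 0.0       # critic being trained: lambda(x) = w * x estimates dJ/dx
    w_copy = 0.0  # second critic copy, adjusted less often
    k = 0.0       # action "network": u = -k * x

    for t in range(1, steps + 1):
        x = rng.uniform(-1.0, 1.0)   # sample a training state
        u = -k * x
        x_next = a * x + b * u

        # The "desired value" uses the critic COPY at the next state.
        lam_next = w_copy * x_next

        # Derivatives of the utility with respect to state and control.
        dU_dx = 2.0 * x
        dU_du = 2.0 * r * u

        # DHP target: total derivative of U + gamma * J(x_next) w.r.t. x,
        # including the path through the controller (du/dx = -k).
        target = dU_dx + dU_du * (-k) + gamma * lam_next * (a - b * k)

        # Critic update (every cycle): gradient step on the squared error.
        w -= lr_critic * (w * x - target) * x

        # Action update (same cycle): descend dU/du + gamma * lambda * dx'/du,
        # chained through du/dk = -x.
        dJ_du = dU_du + gamma * lam_next * b
        k -= lr_action * dJ_du * (-x)

        # Refresh the critic copy only every `copy_every` cycles.
        if t % copy_every == 0:
            w_copy = w

    return w, w_copy, k
```

Calling `train_dhp()` returns the trained critic weight, the critic-copy weight, and the controller gain. Setting `copy_every=1` makes the copy track the trained critic on every cycle, which gives one way to compare the two schedules in this toy setting.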

Keywords :

dynamic programming; heuristic programming; learning (artificial intelligence); motion control; neurocontrollers; vehicles; DHP; computational cycle; controller performance; critic and action neural networks; critic copy; desired value; dual heuristic programming method; independent drive motors; learning; modified training strategy; pole cart controller problem; pole cart problem; state variables; training strategies; utility function; Analytical models; Computer networks; Design methodology; Equations; Intelligent networks; Neural networks; Signal generators; Sliding mode control; Vehicle driving; Wheels;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Systems, Man, and Cybernetics, 1997. Computational Cybernetics and Simulation., 1997 IEEE International Conference on

ISSN :

1062-922X

Print_ISBN :

0-7803-4053-1

Type :

conf

DOI :

10.1109/ICSMC.1997.633058

Filename :

633058

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3565272