Regional Information Center for Science and Technology - Solutions to finite horizon cost problems using actor-critic reinforcement learning

DocumentCode :

671416

Title :

Solutions to finite horizon cost problems using actor-critic reinforcement learning

Author :

Grondman, Ivo ; Hao Xu ; Jagannathan, Sarangapani ; Babuska, Robert

Author_Institution :

Delft Center for Syst. & Control, Delft Univ. of Technol., Delft, Netherlands

fYear :

2013

fDate :

4-9 Aug. 2013

Firstpage :

Lastpage :

Abstract :

Actor-critic reinforcement learning algorithms have been shown to be a successful tool for learning the optimal control for a range of (repetitive) tasks on systems with (partially) unknown dynamics, which may or may not be nonlinear. Most of the reinforcement learning literature published to date models the task at hand only as a Markov decision process with an infinite horizon cost function. In practice, however, a solution is sometimes needed for the case where the cost function is defined over a finite horizon, which makes the optimal control problem time-varying and thus harder to solve. This paper adapts two previously introduced actor-critic algorithms from the infinite horizon setting to the finite horizon setting and applies them to learning a task on a nonlinear system, using radial basis function networks and without requiring any assumptions or knowledge about the system dynamics. Simulations on a typical nonlinear motion control problem show that actor-critic algorithms are capable of solving the difficult problem of time-varying optimal control. Moreover, the benefit of using a model learning technique is shown.
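
The abstract's remark that a finite horizon makes the optimal control time-varying can be illustrated with a minimal sketch. The example below is not the paper's method: it swaps in the textbook scalar linear-quadratic problem with a fully known model, solved by a backward Riccati recursion, purely because that case has a closed-form answer. The function name `finite_horizon_lqr_gains` and all parameter values are assumptions chosen for illustration.

```python
def finite_horizon_lqr_gains(a, b, q, r, q_terminal, horizon):
    """Backward Riccati recursion for the scalar system x[k+1] = a*x[k] + b*u[k].

    Cost (assumed for illustration): sum over k of q*x[k]**2 + r*u[k]**2,
    plus the terminal term q_terminal*x[horizon]**2.
    Returns the feedback gains K[k] (with u[k] = -K[k]*x[k]) and the
    cost-to-go coefficient at k = 0.
    """
    p = q_terminal                  # cost-to-go coefficient at the final step
    gains = [0.0] * horizon
    for k in reversed(range(horizon)):
        gain = (a * b * p) / (r + b * b * p)   # optimal gain at step k
        gains[k] = gain
        p = q + a * a * p - a * b * p * gain   # cost-to-go coefficient at step k
    return gains, p


# Illustrative values only (not taken from the paper).
gains, p0 = finite_horizon_lqr_gains(
    a=1.0, b=1.0, q=1.0, r=1.0, q_terminal=0.0, horizon=5
)
for k, gain in enumerate(gains):
    print(f"k={k}: K={gain:.4f}")
```

The gain depends on the number of steps remaining, so a single stationary feedback law is not optimal here. With an infinite horizon the same recursion settles to one constant gain, which is why the finite horizon case needs a time-indexed policy and value function.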

Keywords :

Markov processes; learning (artificial intelligence); motion control; nonlinear control systems; optimal control; radial basis function networks; time-varying systems; Markov decision process; actor-critic reinforcement learning algorithm; finite horizon cost problems; nonlinear motion control problem; nonlinear system; time varying optimal control problem; Approximation algorithms; Equations; Heuristic algorithms; Mathematical model

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Neural Networks (IJCNN), The 2013 International Joint Conference on

Conference_Location :

Dallas, TX

ISSN :

2161-4393

Print_ISBN :

978-1-4673-6128-6

Type :

conf

DOI :

10.1109/IJCNN.2013.6706755

Filename :

6706755

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=671416