Title :
Policy iteration-mode monotone convergence of generalized policy iteration for discrete-time linear systems
Author :
Tae Yoon Chun ; Jin Bae Park ; Yoon Ho Choi
Author_Institution :
Dept. of Electr. Eng., Yonsei Univ., Seoul, South Korea
Abstract :
This paper presents the policy iteration (PI)-mode monotone convergence and stability properties of generalized policy iteration (GPI) algorithms for discrete-time (DT) linear systems. GPI is a reinforcement-learning-based dynamic programming (DP) method for solving optimal control problems, in which policy evaluation and policy improvement steps are interleaved. To analyze the convergence and stability of GPI, several equivalent equations are derived. From these, the PI-mode monotone convergence (i.e., the regime in which GPI behaves like PI) and the stability of the GPI algorithm are proved under certain initial conditions closely related to a Lyapunov approach. Finally, numerical simulations are performed to verify the proposed convergence and stability properties.
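To make the interleaving of policy evaluation and policy improvement concrete, the following is a minimal sketch of GPI for the DT linear-quadratic problem. It is illustrative only: the zero initializations, the fixed number of evaluation sweeps, and the example system are assumptions, not the paper's exact initial conditions.

```python
import numpy as np

def gpi_lqr(A, B, Q, R, n_eval_sweeps=3, n_iters=50):
    """Sketch of generalized policy iteration for the DT LQR problem.

    Between the two extremes -- value iteration (one evaluation sweep)
    and policy iteration (evaluation run to convergence) -- GPI runs a
    fixed number of Lyapunov-recursion sweeps per improvement step.
    Initializations here are illustrative assumptions.
    """
    n, m = B.shape
    K = np.zeros((m, n))          # initial policy gain (assumed admissible)
    P = np.zeros((n, n))          # initial value-function matrix
    for _ in range(n_iters):
        Ac = A - B @ K            # closed-loop matrix under current policy
        # Partial policy evaluation: a few Lyapunov-recursion sweeps
        for _ in range(n_eval_sweeps):
            P = Q + K.T @ R @ K + Ac.T @ P @ Ac
        # Policy improvement: greedy gain w.r.t. the current P
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    return K, P

# Example: a discretized double integrator (illustrative system)
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q = np.eye(2)
R = np.array([[1.0]])
K, P = gpi_lqr(A, B, Q, R)
```

At convergence, `P` approaches the solution of the discrete-time algebraic Riccati equation and `A - B @ K` is Schur stable, consistent with the stability property the paper establishes.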
Keywords :
Lyapunov methods; discrete time systems; dynamic programming; learning (artificial intelligence); optimal control; stability; GPI; Lyapunov approach; PI-mode monotone convergence; discrete-time linear system; dynamic programming; generalized policy iteration; optimal control problem; policy iteration-mode monotone convergence; reinforcement learning; stability property; Approximation algorithms; Education; Stability analysis; generalized policy iteration; linear quadratic regulator; policy iteration-mode monotone convergence;
Conference_Titel :
2013 13th International Conference on Control, Automation and Systems (ICCAS)
Conference_Location :
Gwangju
Print_ISBN :
978-89-93215-05-2
DOI :
10.1109/ICCAS.2013.6703973