Title :
Policy iteration-mode monotone convergence of generalized policy iteration for discrete-time linear systems
Author :
Tae Yoon Chun ; Jin Bae Park ; Yoon Ho Choi
Author_Institution :
Dept. of Electr. Eng., Yonsei Univ., Seoul, South Korea
Abstract :
This paper presents the policy iteration (PI)-mode monotone convergence and stability properties of generalized policy iteration (GPI) algorithms for discrete-time (DT) linear systems. GPI is a reinforcement-learning-based dynamic programming (DP) method for solving optimal control problems, in which policy evaluation and policy improvement steps are interleaved. To analyze the convergence and stability of GPI, several equivalent equations are derived. From these, the PI-mode monotone convergence (i.e., the regime in which GPI behaves like PI) and the stability of the GPI algorithm are proved under certain initial conditions closely related to a Lyapunov approach. Finally, numerical simulations are performed to verify the proposed convergence and stability properties.
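To make the interleaving of policy evaluation and policy improvement concrete, the following is a minimal sketch of GPI for the DT linear-quadratic problem. It is illustrative only: the zero initializations, the fixed number of evaluation sweeps, and the example system are assumptions, not the paper's exact initial conditions.

```python
import numpy as np

def gpi_lqr(A, B, Q, R, n_eval_sweeps=3, n_iters=50):
    """Sketch of generalized policy iteration for the DT LQR problem.

    Between the two extremes -- value iteration (one evaluation sweep)
    and policy iteration (evaluation run to convergence) -- GPI runs a
    fixed number of Lyapunov-recursion sweeps per improvement step.
    Initializations here are illustrative assumptions.
    """
    n, m = B.shape
    K = np.zeros((m, n))          # initial policy gain (assumed admissible)
    P = np.zeros((n, n))          # initial value-function matrix
    for _ in range(n_iters):
        Ac = A - B @ K            # closed-loop matrix under current policy
        # Partial policy evaluation: a few Lyapunov-recursion sweeps
        for _ in range(n_eval_sweeps):
            P = Q + K.T @ R @ K + Ac.T @ P @ Ac
        # Policy improvement: greedy gain w.r.t. the current P
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    return K, P

# Example: a discretized double integrator (illustrative system)
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q = np.eye(2)
R = np.array([[1.0]])
K, P = gpi_lqr(A, B, Q, R)
```

At convergence, `P` approaches the solution of the discrete-time algebraic Riccati equation and `A - B @ K` is Schur stable, consistent with the stability property the paper establishes.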
Keywords :
Lyapunov methods; discrete time systems; dynamic programming; learning (artificial intelligence); optimal control; stability; GPI; Lyapunov approach; PI-mode monotone convergence; discrete-time linear system; dynamic programming; generalized policy iteration; optimal control problem; policy iteration-mode monotone convergence; reinforcement learning; stability property; Approximation algorithms; Education; Stability analysis; generalized policy iteration; linear quadratic regulator; policy iteration-mode monotone convergence;
Conference_Titel :
2013 13th International Conference on Control, Automation and Systems (ICCAS)
Conference_Location :
Gwangju
Print_ISBN :
978-89-93215-05-2
DOI :
10.1109/ICCAS.2013.6703973