Regional Information Center for Science and Technology - Online Synchronous Policy Iteration based on Concurrent Learning to solve continuous-time optimal control problem

DocumentCode :

3667471

Title :

Online Synchronous Policy Iteration based on Concurrent Learning to solve continuous-time optimal control problem

Author :

Haitao Wang;Dongbin Zhao;Chengdong Li

Author_Institution :

The State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China

fYear :

2015

fDate :

4/1/2015

Firstpage :

297

Lastpage :

302

Abstract :

In this paper, a novel online Synchronous Policy Iteration algorithm based on Concurrent Learning (CLSPI) is presented to solve the continuous-time optimal control problem. The algorithm is designed on an actor-critic architecture. The original Synchronous Policy Iteration (SPI) algorithm updates the parameters of the actor and the critic simultaneously, which differs from the alternating evaluation and improvement steps of standard policy iteration. In SPI, only the current estimation error is used to update the weights, although previously recorded information can also contribute to the weight update. Concurrent learning is a new parameter estimation method that combines previous information with the current estimation error. CLSPI uses concurrent learning to train SPI in order to improve learning performance. Finally, two comparison experiments, covering linear and nonlinear systems, demonstrate that CLSPI obtains the optimal control policy with a faster convergence rate, which shows that CLSPI solves continuous-time optimal control problems more efficiently.
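
The idea of combining previous information with the current estimation error can be illustrated with a small sketch. This is not the CLSPI algorithm from the paper: it assumes a toy linear-in-parameters model y = w·phi, a hypothetical fixed memory of two recorded samples, and plain gradient descent. All function names, the learning rate, and the data are assumptions chosen for illustration. It only shows why stored data can help when the current regressor alone does not identify the weights.

```python
# Toy sketch (assumed setup, not the paper's CLSPI update law).

def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def current_only_step(w, phi, y, lr):
    """Gradient step that uses only the current estimation error."""
    err = dot(w, phi) - y
    return [wi - lr * err * p for wi, p in zip(w, phi)]


def concurrent_step(w, phi, y, memory, lr):
    """Gradient step on the current error plus the errors on stored samples."""
    samples = [(phi, y)] + list(memory)
    grad = [0.0] * len(w)
    for phi_j, y_j in samples:
        err_j = dot(w, phi_j) - y_j
        grad = [g + err_j * p for g, p in zip(grad, phi_j)]
    return [wi - lr * g for wi, g in zip(w, grad)]


def run(use_memory, steps=500, lr=0.1):
    """Estimate w_true from a regressor that stays in a single direction."""
    w_true = [2.0, -1.0]                     # hypothetical true weights
    # Hypothetical recorded data: two earlier samples that span both axes.
    memory = [([1.0, 0.0], 2.0), ([0.0, 1.0], -1.0)]
    phi = [1.0, 1.0]                         # current regressor never changes
    y = dot(w_true, phi)
    w = [0.0, 0.0]
    for _ in range(steps):
        if use_memory:
            w = concurrent_step(w, phi, y, memory, lr)
        else:
            w = current_only_step(w, phi, y, lr)
    err = sum((a - b) ** 2 for a, b in zip(w, w_true)) ** 0.5
    return w, err


if __name__ == "__main__":
    print("current error only:", run(use_memory=False))
    print("with stored samples:", run(use_memory=True))
```

In this toy setting the current-only update can match the single measured output without recovering the individual weights, while the update that also replays stored samples is driven toward the true weights because the stored regressors span the parameter space.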

Keywords :

"Artificial neural networks","Computational intelligence"

Publisher :

IEEE

Conference_Title :

Information Science and Technology (ICIST), 2015 5th International Conference on

Type :

conf

DOI :

10.1109/ICIST.2015.7288986

Filename :

7288986

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3667471