Title :
Generalized Policy Iteration Adaptive Dynamic Programming for Discrete-Time Nonlinear Systems
Author :
Derong Liu ; Qinglai Wei ; Pengfei Yan
Author_Institution :
State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Beijing, China
Abstract :
This paper presents a novel generalized policy iteration algorithm for solving optimal control problems for discrete-time nonlinear systems. The idea is to use an iterative adaptive dynamic programming algorithm to obtain iterative control laws under which the iterative value functions converge to the optimum. When the algorithm is initialized by an admissible control law, it is shown that the iterative value functions are monotonically nonincreasing and converge to the optimal solution of the Hamilton-Jacobi-Bellman equation, under the assumption that a perfect function approximation is employed. The admissibility property is analyzed, showing that each of the iterative control laws stabilizes the nonlinear system. Neural networks are used to implement the generalized policy iteration algorithm by approximating the iterative value function and computing the iterative control law, respectively, to achieve approximate optimal control. Finally, numerical examples are presented to verify the effectiveness of the presented generalized policy iteration algorithm.
Keywords :
adaptive control; discrete time systems; dynamic programming; function approximation; iterative methods; neurocontrollers; nonlinear control systems; optimal control; stability; Hamilton-Jacobi-Bellman equation; admissibility property; admissible control law; discrete-time system; generalized policy iteration; iterative adaptive dynamic programming; iterative control law; iterative value function; neural networks; nonlinear system; Learning (artificial intelligence); Adaptive critic designs; adaptive dynamic programming (ADP); approximate dynamic programming; neuro-dynamic programming; nonlinear systems; reinforcement learning
Journal_Title :
IEEE Transactions on Systems, Man, and Cybernetics: Systems
DOI :
10.1109/TSMC.2015.2417510
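
The iteration outlined in the abstract can be illustrated with a minimal tabular sketch. This is not the authors' implementation: the paper uses neural networks for the value function and control law, whereas the sketch below swaps in exact lookup tables. The toy system (integer states -5..5, controls -2..2, utility x^2 + u^2) and all names (step, utility, policy_cost, generalized_policy_iteration, eval_sweeps) are assumptions made up for illustration. Each outer iteration improves the control law greedily and then applies only a fixed number of policy evaluation sweeps, which is the sense in which the scheme sits between value iteration and policy iteration.

```python
# Hypothetical toy problem, NOT taken from the paper.
STATES = range(-5, 6)
CONTROLS = (-2, -1, 0, 1, 2)


def step(x, u):
    """Assumed system function F(x, u): integrator clamped to the grid."""
    return max(-5, min(5, x + u))


def utility(x, u):
    """Assumed positive definite utility U(x, u)."""
    return x * x + u * u


def sign(x):
    return (x > 0) - (x < 0)


def policy_cost(policy, max_steps=100):
    """Total cost of following `policy` from each state.

    Assumes the policy drives the state to the origin within
    `max_steps`; otherwise the returned cost is truncated.
    """
    costs = {}
    for x0 in STATES:
        x, total = x0, 0
        for _ in range(max_steps):
            if x == 0:
                break
            u = policy[x]
            total += utility(x, u)
            x = step(x, u)
        costs[x0] = total
    return costs


def generalized_policy_iteration(value, outer_iters=5, eval_sweeps=2):
    """Alternate greedy improvement with a few evaluation sweeps.

    `value` is the initial value table, here the cost of an
    admissible control law. Returns the final value table, the
    final control law, and the history of value tables.
    """
    history = [dict(value)]
    policy = {}
    for _ in range(outer_iters):
        # Policy improvement: greedy control law w.r.t. current values.
        policy = {
            x: min(CONTROLS, key=lambda u: utility(x, u) + value[step(x, u)])
            for x in STATES
        }
        # Partial policy evaluation: one backup from the old values,
        # then `eval_sweeps` further backups under the fixed policy.
        new = {x: utility(x, policy[x]) + value[step(x, policy[x])]
               for x in STATES}
        for _ in range(eval_sweeps):
            new = {x: utility(x, policy[x]) + new[step(x, policy[x])]
                   for x in STATES}
        value = new
        history.append(dict(value))
    return value, policy, history


if __name__ == "__main__":
    # Admissible initial control law: move one step toward the origin.
    initial_policy = {x: -sign(x) for x in STATES}
    v0 = policy_cost(initial_policy)
    value, policy, history = generalized_policy_iteration(v0)
    for x in STATES:
        print(x, v0[x], value[x], policy[x])
```

Starting from the cost of the admissible law u = -sign(x), the value tables in this toy case never increase from one outer iteration to the next, consistent with the monotonicity property stated in the abstract. Setting eval_sweeps to 0 reduces the loop to plain value iteration, while a large eval_sweeps approaches full policy iteration.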