• DocumentCode
    25423
  • Title
    Policy Iteration Adaptive Dynamic Programming Algorithm for Discrete-Time Nonlinear Systems
  • Author
    Derong Liu; Qinglai Wei
  • Author_Institution
    State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Beijing, China
  • Volume
    25
  • Issue
    3
  • fYear
    2014
  • fDate
    March 2014
  • Firstpage
    621
  • Lastpage
    634
  • Abstract
    This paper presents a new discrete-time policy iteration adaptive dynamic programming (ADP) method for solving the infinite-horizon optimal control problem of nonlinear systems. The idea is to use an iterative ADP technique to obtain an iterative control law that optimizes the iterative performance index function. The main contribution is the first analysis of the convergence and stability properties of the policy iteration method for discrete-time nonlinear systems. It is shown that the iterative performance index function is monotonically nonincreasing and converges to the optimal solution of the Hamilton-Jacobi-Bellman equation. It is also proven that each of the iterative control laws stabilizes the nonlinear system. To facilitate implementation of the iterative ADP algorithm, neural networks are used to approximate the performance index function and to compute the optimal control law, respectively, and the convergence of the weight matrices is analyzed. Finally, numerical results and analysis are presented to illustrate the performance of the developed method.
  • Keywords
    adaptive control; convergence; discrete time systems; dynamic programming; infinite horizon; iterative methods; matrix algebra; neurocontrollers; nonlinear control systems; optimal control; performance index; stability; ADP method; Hamilton-Jacobi-Bellman equation; discrete-time nonlinear systems; discrete-time policy iteration adaptive dynamic programming algorithm; infinite horizon optimal control problem; iterative ADP algorithm; iterative ADP technique; iterative control law; iterative performance index function; neural networks; optimal control law; optimal solution; policy iteration method; stability properties; stabilization; weight matrices; Adaptive critic designs; adaptive dynamic programming (ADP); approximate dynamic programming; discrete-time policy iteration; neurodynamic programming; nonlinear systems; reinforcement learning
  • fLanguage
    English
  • Journal_Title
    IEEE Transactions on Neural Networks and Learning Systems
  • Publisher
    IEEE
  • ISSN
    2162-237X
  • Type
    jour
  • DOI
    10.1109/TNNLS.2013.2281663
  • Filename
    6609085
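
The policy iteration loop summarized in the abstract (evaluate the current control law, then improve it, and repeat) can be illustrated on a much simpler problem than the one the paper treats. The sketch below is not the paper's neural-network implementation; it assumes a scalar linear system x[k+1] = a*x[k] + b*u[k] with quadratic utility q*x^2 + r*u^2, so that the performance index of the control law u = -k*x has the closed form V(x) = p*x^2. The function name `policy_iteration` and all parameter names are hypothetical and chosen only for this illustration.

```python
def policy_iteration(a, b, q, r, k0, iters=20):
    """Policy iteration for x[k+1] = a*x[k] + b*u[k] (illustrative scalar case).

    The control law is u = -k*x and its performance index is V(x) = p*x**2.
    Returns a list of (k, p) pairs, one per iteration.
    """
    k = k0
    history = []
    for _ in range(iters):
        closed_loop = a - b * k
        # The performance index is finite only for a stabilizing control law.
        if abs(closed_loop) >= 1.0:
            raise ValueError("control law is not stabilizing")
        # Policy evaluation: solve p = q + r*k**2 + p*closed_loop**2 for p.
        p = (q + r * k * k) / (1.0 - closed_loop ** 2)
        history.append((k, p))
        # Policy improvement: minimize q*x**2 + r*u**2 + p*(a*x + b*u)**2 over u.
        k = a * b * p / (r + b * b * p)
    return history


if __name__ == "__main__":
    # Start from an admissible (stabilizing) gain: |1.2 - 1.0| < 1.
    for i, (k, p) in enumerate(policy_iteration(1.2, 1.0, 1.0, 1.0, k0=1.0, iters=6)):
        print(f"iteration {i}: gain k = {k:.6f}, performance index coefficient p = {p:.6f}")
```

In this special case the two properties stated in the abstract can be checked directly: the coefficient p does not increase from one iteration to the next, and every gain k keeps |a - b*k| below 1. The iteration must start from a stabilizing gain, which is why the sketch raises an error otherwise.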