• DocumentCode
    7313
  • Title
    Revisiting Approximate Dynamic Programming and its Convergence
  • Author
    Heydari, Ali
  • Author_Institution
    Dept. of Mech. Eng., South Dakota Sch. of Mines & Technol., Rapid City, SD, USA
  • Volume
    44
  • Issue
    12
  • fYear
    2014
  • fDate
    Dec. 2014
  • Firstpage
    2733
  • Lastpage
    2743
  • Abstract
    Value iteration-based approximate/adaptive dynamic programming (ADP) is investigated as an approximate solution to infinite-horizon optimal control problems with deterministic dynamics and continuous state and action spaces. The learning iterations are decomposed into an outer loop and an inner loop. A relatively simple proof of the convergence of the outer-loop iterations to the optimal solution is provided, based on a novel idea with some new features: an analogy between the value function during the iterations and the value function of a fixed-final-time optimal control problem. The inner loop avoids the need to numerically solve a set of nonlinear equations or a nonlinear optimization problem at each ADP iteration for the policy update. Sufficient conditions are obtained for the uniqueness of the solution to the policy update equation and for the convergence of the inner-loop iterations to that solution. The results are then cast as a learning algorithm for training a neurocontroller or creating a look-up table for optimal control of nonlinear systems with different initial conditions. Finally, some features of the investigated method are analyzed numerically.
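    The outer/inner loop structure described in the abstract can be sketched as follows. This is a hedged illustration on a hypothetical scalar control-affine system x_{k+1} = f(x) + g(x)u with running cost Q(x) + Ru^2; the dynamics, cost weights, state grid, and tolerances are illustrative assumptions, not values from the paper. The outer loop performs value iteration over a look-up table, while the inner loop iterates the implicit policy update equation as a fixed-point map instead of solving a nonlinear optimization at each step.

    ```python
    import numpy as np

    # Assumed problem data (illustrative only, not from the paper):
    f = lambda x: 0.9 * x            # assumed drift term
    g = lambda x: 0.5                # assumed (constant) input gain
    Q = lambda x: x ** 2             # assumed state penalty
    R = 1.0                          # assumed control penalty

    xs = np.linspace(-1.0, 1.0, 201)  # state grid: the "look-up table"
    V = np.zeros_like(xs)             # V_0 = 0 initializes value iteration

    def dV(x):
        """Finite-difference gradient of the tabulated value function."""
        h = 1e-4
        return (np.interp(x + h, xs, V) - np.interp(x - h, xs, V)) / (2 * h)

    u = np.zeros_like(xs)
    for outer in range(200):          # outer loop: value iteration
        for inner in range(100):      # inner loop: fixed-point policy update
            # The minimizing control satisfies the implicit equation
            # u = -(g / 2R) * V'(f(x) + g(x) u); iterating this map avoids
            # numerically solving a nonlinear equation at every step.
            x_next = f(xs) + g(xs) * u
            u_new = -(g(xs) / (2.0 * R)) * dV(x_next)
            if np.max(np.abs(u_new - u)) < 1e-9:
                u = u_new
                break
            u = u_new
        # Bellman backup: V_{i+1}(x) = Q(x) + R u^2 + V_i(f(x) + g(x) u)
        x_next = f(xs) + g(xs) * u
        V_new = Q(xs) + R * u ** 2 + np.interp(x_next, xs, V)
        if np.max(np.abs(V_new - V)) < 1e-7:
            V = V_new
            break
        V = V_new
    ```

    For this linear-quadratic instance the table converges toward V(x) ≈ 2.12 x² (the discrete-time Riccati solution of the assumed system), and the inner-loop map is contractive here, which is the kind of sufficient condition for inner-loop convergence the paper establishes.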
  • Keywords
    dynamic programming; infinite horizon; iterative methods; learning systems; neurocontrollers; nonlinear control systems; nonlinear equations; optimal control; table lookup; ADP; action spaces; adaptive dynamic programming; approximate dynamic programming; continuous state spaces; convergence; deterministic dynamics; fixed-final-time optimal control problem; infinite-horizon optimal control problems; learning iterations; look-up table; neurocontroller training; nonlinear optimization problem; nonlinear systems; outer-loop iterations; policy update equation; value function; value iteration; Approximation methods; Convergence; Dynamic programming; Equations; Mathematical model; Optimal control; Vectors; Approximate dynamic programming; nonlinear control systems; optimal control
  • fLanguage
    English
  • Journal_Title
    IEEE Transactions on Cybernetics
  • Publisher
    IEEE
  • ISSN
    2168-2267
  • Type
    jour
  • DOI
    10.1109/TCYB.2014.2314612
  • Filename
    6815973