• DocumentCode
    735082
  • Title
    Neural-network based online policy iteration for continuous-time infinite-horizon optimal control of nonlinear systems
  • Author
    Difan Tang ; Lei Chen ; Zhao Feng Tian
  • Author_Institution
    Sch. of Mech. Eng., Univ. of Adelaide, Adelaide, SA, Australia
  • fYear
    2015
  • fDate
    12-15 July 2015
  • Firstpage
    792
  • Lastpage
    796
  • Abstract
    A new policy-iteration algorithm based on neural networks (NNs) is proposed in this paper to synthesize optimal control laws online for continuous-time nonlinear systems. Recent advances in this field have enabled synchronous policy iteration but require an additional tuning loop or a logic switch mechanism to maintain system stability. A new algorithm is thus derived in this paper to address this limitation. The optimal control law is found by solving the Hamilton-Jacobi-Bellman (HJB) equation for the associated value function via synchronous policy iteration in a critic-actor configuration. As a major contribution, a new form of NN approximation for the value function is proposed, providing asymptotic stability of the closed-loop system without an additional tuning scheme or logic switch mechanism. As a second contribution, an extended Kalman filter is introduced to estimate the critic NN parameters for fast convergence. The efficacy of the new algorithm is verified by simulations.
  • Keywords
    Kalman filters; closed loop systems; continuous time systems; control system synthesis; infinite horizon; neurocontrollers; nonlinear control systems; nonlinear filters; optimal control; stability; HJB equation; Hamilton-Jacobi-Bellman equation; NN approximation; NNs; associated value function; closed-loop system asymptotic stability; continuous-time infinite-horizon optimal control; continuous-time nonlinear systems; critic NN parameter estimation; critic-actor configuration; extended Kalman filter; logic switch mechanism; neural networks; online policy iteration; optimal control law synthesis; synchronous policy iteration; system stability; tuning loop; Approximation methods; Decision support systems; Dynamic programming; Markov processes; Radio frequency; Robustness; TV; machine learning; neural network; nonlinear system; optimal control; policy iteration;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Signal and Information Processing (ChinaSIP), 2015 IEEE China Summit and International Conference on
  • Conference_Location
    Chengdu
  • Type
    conf
  • DOI
    10.1109/ChinaSIP.2015.7230513
  • Filename
    7230513