• DocumentCode
    8252
  • Title
    Near optimal output feedback control of nonlinear discrete-time systems based on reinforcement neural network learning
  • Author
    Qiming Zhao; Hao Xu; Sarangapani Jagannathan
  • Author_Institution
    DENSO Int. America Inc., Southfield, MI, USA
  • Volume
    1
  • Issue
    4
  • fYear
    2014
  • fDate
    Oct. 2014
  • Firstpage
    372
  • Lastpage
    384
  • Abstract
    In this paper, the output feedback based finite-horizon near optimal regulation of nonlinear affine discrete-time systems with unknown system dynamics is considered by using neural networks (NNs) to approximate the solution of the Hamilton-Jacobi-Bellman (HJB) equation. First, an NN-based Luenberger observer is proposed to reconstruct both the system states and the control coefficient matrix. Next, a reinforcement learning methodology with an actor-critic structure is utilized to approximate the time-varying solution of the HJB equation, referred to as the value function, by using an NN. To satisfy the terminal constraint, a new error term is defined and incorporated in the NN update law so that the terminal constraint error is also minimized over time. An NN with constant weights and a time-dependent activation function is employed to approximate the time-varying value function, which is subsequently utilized to generate the control policy; the resulting policy is near optimal, rather than exactly optimal, due to NN reconstruction errors. The proposed scheme functions in a forward-in-time manner without an offline training phase. Lyapunov analysis is used to investigate the stability of the overall closed-loop system. Simulation results are given to show the effectiveness and feasibility of the proposed method.
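    The abstract describes a critic NN with constant weights and time-dependent activation functions that approximates the finite-horizon value function, with an extra error term in the update law to enforce the terminal constraint. A minimal sketch of that idea (not the authors' exact algorithm; the dynamics, features, gains, horizon, and clipping below are illustrative assumptions) might look like:

    ```python
    import numpy as np

    np.random.seed(0)

    N = 20            # finite horizon length (assumed)
    alpha = 0.02      # critic learning rate (assumed)
    Q, R = 1.0, 1.0   # stage-cost weights (assumed)
    QN = 2.0          # terminal-cost weight (assumed)

    def f(x):
        # assumed mildly nonlinear drift dynamics (stand-in for the unknown system)
        return 0.8 * x + 0.1 * np.sin(x)

    def g(x):
        # assumed control coefficient (reconstructed by the observer in the paper)
        return 1.0

    def phi(x, k):
        # time-dependent activation: polynomial features scaled by remaining horizon
        tau = (N - k) / N
        return np.array([x * x, x * x * tau, tau])

    w = np.zeros(3)   # constant critic weights, V_k(x) ~ w^T phi(x, k)

    for episode in range(200):
        x = np.random.uniform(-1.0, 1.0)
        for k in range(N):
            # greedy control from the next-step value gradient,
            # u = -(1/2) R^{-1} g^T dV_{k+1}/dx (finite-difference gradient)
            eps = 1e-4
            dVdx = (w @ phi(x + eps, k + 1) - w @ phi(x - eps, k + 1)) / (2 * eps)
            u = np.clip(-0.5 / R * g(x) * dVdx, -1.0, 1.0)
            x_next = np.clip(f(x) + g(x) * u, -1.5, 1.5)
            cost = Q * x * x + R * u * u
            # temporal-difference (Bellman) error for the finite-horizon value
            td = cost + w @ phi(x_next, k + 1) - w @ phi(x, k)
            # terminal constraint error: V_N should match the terminal cost
            term = w @ phi(x_next, N) - QN * x_next * x_next
            # joint gradient step minimizing both errors, echoing the abstract's
            # update law that also drives the terminal constraint error to zero
            w -= alpha * (td * (phi(x_next, k + 1) - phi(x, k))
                          + term * phi(x_next, N))
            x = x_next
    ```

    The key design point mirrored here is that the weights `w` are constant while the activations `phi(x, k)` carry the time dependence, so a single weight vector represents the whole time-varying value function, and the terminal-constraint term is minimized jointly with the Bellman error rather than imposed only at the final step.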
  • Keywords
    Lyapunov methods; closed loop systems; discrete time systems; feedback; learning (artificial intelligence); matrix algebra; neurocontrollers; nonlinear systems; observers; optimal control; partial differential equations; stability; time-varying systems; transfer functions; HJB equation solution; Hamilton-Jacobi-Bellman equation solution; Lyapunov analysis; NN reconstruction errors; NN update law; NN-based Luenberger observer; actor-critic structure; closed-loop system stability; control coefficient matrix; error term; finite-horizon near optimal control policy; near optimal output feedback control; nonlinear affine discrete-time systems; output feedback based finite-horizon near optimal regulation; reinforcement neural network learning; system state reconstruction; terminal constraint; time-dependent activation function; time-varying solution; time-varying value function; unknown system dynamics; value function; Approximation methods; Artificial neural networks; Feedback; Learning (artificial intelligence); Nonlinear dynamical systems; Observers; Optimal control; Finite-horizon; Hamilton-Jacobi-Bellman equation; approximate dynamic programming; neural network; optimal regulation
  • fLanguage
    English
  • Journal_Title
    IEEE/CAA Journal of Automatica Sinica
  • Publisher
    IEEE
  • ISSN
    2329-9266
  • Type
    jour
  • DOI
    10.1109/JAS.2014.7004665
  • Filename
    7004665