• DocumentCode
    2910653
  • Title

    On integral value iteration for continuous-time linear systems

  • Author

    Jae Young Lee ; Jin Bae Park ; Yoon Ho Choi

  • Author_Institution
    Dept. of Electr. & Electron. Eng., Yonsei Univ., Seoul, South Korea
  • fYear
    2013
  • fDate
    17-19 June 2013
  • Firstpage
    4215
  • Lastpage
    4220
  • Abstract
    This paper investigates the properties of integral value iteration (I-VI), a reinforcement learning (RL) technique for solving continuous-time (CT) optimal control problems online without using the system drift dynamics. The I-VI considered here is the one applied to CT linear quadratic regulation problems. Two modes of global monotone convergence of I-VI are presented: one behaves like policy iteration (PI) and is called the PI-mode of convergence; the other is called the VI-mode of convergence. All of the other properties (positive definiteness, stability, and the relation between I-VI and integral PI) are presented within these two frameworks. Finally, numerical simulations are carried out to verify and further investigate these properties.
  • Keywords
    continuous time systems; iterative methods; learning (artificial intelligence); linear systems; optimal control; stability; CT linear quadratic regulation; CT optimal control; I-VI-PI relation property; RL technique; continuous-time linear system; continuous-time optimal control; integral value iteration; numerical simulation; policy iteration; positive definiteness property; reinforcement learning technique; stability property; system drift dynamics; Convergence; DC motors; Heuristic algorithms; Numerical stability; Riccati equations; Stability analysis; LQR; approximate dynamic programming; monotone convergence; reinforcement learning; value iteration;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    American Control Conference (ACC), 2013
  • Conference_Location
    Washington, DC
  • ISSN
    0743-1619
  • Print_ISBN
    978-1-4799-0177-7
  • Type

    conf

  • DOI
    10.1109/ACC.2013.6580487
  • Filename
    6580487