• DocumentCode
    3217549
  • Title
    Optimal adaptive control for unknown systems using output feedback by reinforcement learning methods
  • Author
    Lewis, F.L.; Vamvoudakis, Kyriakos G.
  • Author_Institution
    Autom. & Robot. Res. Inst., Univ. of Texas at Arlington, Fort Worth, TX, USA
  • fYear
    2010
  • fDate
    9-11 June 2010
  • Firstpage
    2138
  • Lastpage
    2145
  • Abstract
    Optimal feedback controllers are generally computed offline assuming full knowledge of the system dynamics. Adaptive controllers, on the other hand, are online schemes that effectively learn to compensate for unknown system dynamics and disturbances. Generally, direct adaptive schemes do not converge to optimal control solutions for user-prescribed performance measures. In recent years, it has been shown that reinforcement learning techniques from computational intelligence can be used to learn optimal feedback controllers online using direct adaptive control techniques without knowing the system dynamics. Most reinforcement learning methods require full measurements of the internal state of the system. In this paper we develop reinforcement learning methods that require only output feedback and yet converge to an optimal controller. Deterministic linear time-invariant systems are considered. Both policy iteration (PI) and value iteration (VI) algorithms are derived. This setting corresponds to optimal control for a class of partially observable Markov decision processes (POMDPs). It is shown that, similar to Q-learning, the new output-feedback optimal learning methods have the important advantage that knowledge of the system dynamics is not needed for their implementation. Only the order of the system and an upper bound on its ‘observability index’ must be known. The learned output feedback controller takes the form of a polynomial ARMA controller whose performance is equivalent to that of the optimal state variable feedback gain.
  • Keywords
    Adaptive control; Computational intelligence; Control systems; Learning systems; Optimal control; Output feedback; Polynomials; Programmable control; State feedback; Upper bound; Output feedback Approximate Dynamic Programming; Policy Iteration; Value Iteration
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Control and Automation (ICCA), 2010 8th IEEE International Conference on
  • Conference_Location
    Xiamen, China
  • ISSN
    1948-3449
  • Print_ISBN
    978-1-4244-5195-1
  • Electronic_ISBN
    1948-3449
  • Type
    conf
  • DOI
    10.1109/ICCA.2010.5524211
  • Filename
    5524211