• DocumentCode
    2105996
  • Title
    A modified actor-critic reinforcement learning algorithm
  • Author
    Mustapha, Sidi M.; Lachiver, Gerard
  • Author_Institution
    Dept. of Electr. Eng. & Comput. Eng., Sherbrooke Univ., Que., Canada
  • Volume
    2
  • fYear
    2000
  • fDate
    2000
  • Firstpage
    605
  • Abstract
    This paper proposes a fast and efficient actor-critic reinforcement learning algorithm that is novel in at least two ways: it updates the critic only when the best action is executed, and it takes full advantage of the powerful temporal difference (TD) prediction method to train a continuous-valued actor. The actor and the critic are represented separately by two adaptive neural fuzzy systems tuned by a backpropagation algorithm. While the critic adapts to the actor by minimizing the quadratic sum of the TD error, the actor adapts to the critic by using not only the TD error but also the state value function. The new actor-critic architecture is applied to an inverted pendulum system, which is widely used to compare reinforcement learning architectures.
  • Keywords
    adaptive systems; backpropagation; computational complexity; fuzzy neural nets; minimisation; temporal reasoning; TD prediction method; adaptive neural fuzzy systems; backpropagation algorithm; continuous-valued actor training; inverted pendulum system; modified actor-critic reinforcement learning algorithm; quadratic sum minimization; reinforcement learning architectures; state value function; temporal difference prediction method; Adaptive systems; Backpropagation algorithms; Delay; Fuzzy systems; Learning systems; Power engineering and energy; Power engineering computing; Prediction methods; State estimation; Stochastic processes;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Electrical and Computer Engineering, 2000 Canadian Conference on
  • Conference_Location
    Halifax, NS
  • ISSN
    0840-7789
  • Print_ISBN
    0-7803-5957-7
  • Type
    conf
  • DOI
    10.1109/CCECE.2000.849537
  • Filename
    849537
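
The abstract states that the critic is trained by minimizing the quadratic sum of the TD error. As a rough illustration of that one idea only, the sketch below computes a one-step TD error and applies a semi-gradient step to a critic. Everything specific here is an assumption made for illustration and is not taken from the paper: the linear critic (the paper uses adaptive neural fuzzy systems), the function names `td_error` and `critic_step`, the discount `gamma`, and the learning rate `lr`. The paper's rule of updating the critic only when the best action is executed, and its actor update, are not reproduced.

```python
import numpy as np


def td_error(reward, value, next_value, gamma=0.95, terminal=False):
    """One-step TD error: delta = r + gamma * V(s') - V(s).

    gamma is an assumed discount factor, not a value from the paper.
    """
    target = reward if terminal else reward + gamma * next_value
    return target - value


def critic_step(weights, features, delta, lr=0.1):
    """Semi-gradient step on 0.5 * delta**2 for a linear critic.

    Assumes V(s) = weights . features(s) and treats the TD target as fixed,
    so the update direction is delta * features.
    """
    return weights + lr * delta * features


# Hypothetical two-feature example: one transition with reward 1.
w = np.zeros(2)
phi_s = np.array([1.0, 0.0])      # features of the current state
phi_next = np.array([0.0, 1.0])   # features of the next state

delta = td_error(reward=1.0, value=w @ phi_s, next_value=w @ phi_next)
w = critic_step(w, phi_s, delta)
```

With zero initial weights the value estimates are both zero, so the TD error equals the reward and only the weight on the active feature of the current state moves.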