• DocumentCode
    2098889
  • Title

    Asynchronous stochastic approximation and Q-learning

  • Author

    Tsitsiklis, John N.

  • Author_Institution
    Lab. for Inf. & Decision Syst., MIT, Cambridge, MA, USA
  • fYear
    1993
  • fDate
    15-17 Dec 1993
  • Firstpage
    395
  • Abstract
    Provides some general results on the convergence of a class of stochastic approximation algorithms and their parallel and asynchronous variants. The author then uses these results to study the Q-learning algorithm, a reinforcement learning method for solving Markov decision problems, and establishes its convergence under conditions more general than previously available.
  • Keywords
    Markov processes; convergence; decision theory; dynamic programming; learning (artificial intelligence); Markov decision problems; Q-learning; asynchronous stochastic approximation; reinforcement learning method; Adaptive control; Approximation algorithms; Computational modeling; Costs; Laboratories; Learning systems; Random variables; Stochastic processes
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Decision and Control, 1993., Proceedings of the 32nd IEEE Conference on
  • Conference_Location
    San Antonio, TX
  • Print_ISBN
    0-7803-1298-8
  • Type

    conf

  • DOI
    10.1109/CDC.1993.325119
  • Filename
    325119