DocumentCode
2098889
Title
Asynchronous stochastic approximation and Q-learning
Author
Tsitsiklis, John N.
Author_Institution
Lab. for Inf. & Decision Syst., MIT, Cambridge, MA, USA
fYear
1993
fDate
15-17 Dec 1993
Firstpage
395
Abstract
Provides some general results on the convergence of a class of stochastic approximation algorithms and their parallel and asynchronous variants. The author then uses these results to study the Q-learning algorithm, a reinforcement learning method for solving Markov decision problems, and establishes its convergence under conditions more general than previously available.
Keywords
Markov processes; convergence; decision theory; dynamic programming; learning (artificial intelligence); Markov decision problems; Q-learning; asynchronous stochastic approximation; reinforcement learning method; Adaptive control; Approximation algorithms; Computational modeling; Costs; Laboratories; Learning systems; Random variables; Stochastic processes
fLanguage
English
Publisher
IEEE
Conference_Titel
Decision and Control, 1993., Proceedings of the 32nd IEEE Conference on
Conference_Location
San Antonio, TX
Print_ISBN
0-7803-1298-8
Type
conf
DOI
10.1109/CDC.1993.325119
Filename
325119