Asynchronous stochastic approximation and Q-learning

Author

Tsitsiklis, John N.

Author_Institution

Lab. for Inf. & Decision Syst., MIT, Cambridge, MA, USA

fYear

1993

fDate

15-17 Dec 1993

Firstpage

395

Abstract

This paper provides some general results on the convergence of a class of stochastic approximation algorithms and their parallel and asynchronous variants. The author then uses these results to study the Q-learning algorithm, a reinforcement learning method for solving Markov decision problems, and establishes its convergence under conditions more general than previously available.

Keywords

Markov processes; convergence; decision theory; dynamic programming; learning (artificial intelligence); Markov decision problems; Q-learning; asynchronous stochastic approximation; reinforcement learning method; Adaptive control; Approximation algorithms; Computational modeling; Costs; Laboratories; Learning systems; Random variables; Stochastic processes

fLanguage

English

Publisher

IEEE

Conference_Title

Proceedings of the 32nd IEEE Conference on Decision and Control, 1993

Conference_Location

San Antonio, TX

Print_ISBN

0-7803-1298-8

Type

conf

DOI

10.1109/CDC.1993.325119

Filename

325119

Link To Document

https://search.isc.ac/dl/search/defaultta.aspx?DTC=49&DC=2098889