Regional Information Center for Science and Technology (RICeST) - On the mean-square rate of convergence of temporal-difference learning algorithms

DocumentCode :

3613838

Title :

On the mean-square rate of convergence of temporal-difference learning algorithms

Author :

V.B. Tadic

Author_Institution :

Dept. of Electr. & Electron. Eng., Melbourne Univ., Parkville, Vic., Australia

Volume :

fYear :

2002

fDate :

2002

Firstpage :

1454

Abstract :

This paper analyzes the mean-square rate of convergence of temporal-difference learning algorithms. The analysis is carried out for the case of a discounted cost function associated with a Markov chain with a finite-dimensional state space. Under mild conditions, it is shown that these algorithms converge at the rate O(n^{-1/2}). The results are illustrated with examples related to random coefficient autoregression models and M/G/1 queues.

Keywords :

"Convergence","Cost function","Function approximation","Algorithm design and analysis","Automatic control","Approximation error","Stochastic processes","Predictive models","Performance analysis","Australia Council"

Publisher :

IEEE

Conference_Title :

Proceedings of the 2002 American Control Conference

ISSN :

0743-1619

Print_ISBN :

0-7803-7298-0

Type :

conf

DOI :

10.1109/ACC.2002.1023226

Filename :

1023226

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3613838