Regional Information Center for Science and Technology - A mean-square asymptotic analysis of a constant stepsize temporal-difference learning algorithm

DocumentCode :

3782285

Title :

A mean-square asymptotic analysis of a constant stepsize temporal-difference learning algorithm

Author :

V. Tadic

Author_Institution :

Mihajlo Pupin Inst., Belgrade, Serbia

Volume :

Year :

1999

Firstpage :

2862

Abstract :

The mean-square asymptotic behavior of constant stepsize temporal-difference algorithms is analyzed. The analysis is carried out for the case of a linear (cost-to-go) function approximation and for the case of Markov chains with an uncountable state space. An asymptotic upper bound for the mean-square deviation of the algorithm iterations from the optimal value of the parameter of the (cost-to-go) function approximator achievable by temporal-difference learning is determined as a function of stepsize.

Keywords :

"Algorithm design and analysis","Function approximation","Approximation algorithms","State-space methods","Upper bound","Convergence","Stochastic systems","Predictive models","Control system analysis","Time series analysis"

Publisher :

IEEE

Conference_Title :

Proceedings of the 1999 American Control Conference

ISSN :

0743-1619

Print_ISBN :

0-7803-4990-3

Type :

conf

DOI :

10.1109/ACC.1999.786595

Filename :

786595

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3782285