DocumentCode
3782285
Title
A mean-square asymptotic analysis of a constant stepsize temporal-difference learning algorithm
Author
V. Tadic
Author_Institution
Mihajlo Pupin Inst., Belgrade, Serbia
Volume
4
fYear
1999
Firstpage
2862
Abstract
The mean-square asymptotic behavior of constant-stepsize temporal-difference learning algorithms is analyzed. The analysis covers linear (cost-to-go) function approximation and Markov chains with an uncountable state space. An asymptotic upper bound on the mean-square deviation of the algorithm iterates from the optimal parameter value of the (cost-to-go) function approximator achievable by temporal-difference learning is determined as a function of the stepsize.
Keywords
"Algorithm design and analysis","Function approximation","Approximation algorithms","State-space methods","Upper bound","Convergence","Stochastic systems","Predictive models","Control system analysis","Time series analysis"
Publisher
IEEE
Conference_Titel
Proceedings of the 1999 American Control Conference
ISSN
0743-1619
Print_ISBN
0-7803-4990-3
Type
conf
DOI
10.1109/ACC.1999.786595
Filename
786595