DocumentCode :
3782285
Title :
A mean-square asymptotic analysis of a constant stepsize temporal-difference learning algorithm
Author :
V. Tadic
Author_Institution :
Mihajlo Pupin Inst., Belgrade, Serbia
Volume :
4
fYear :
1999
Firstpage :
2862
Abstract :
The mean-square asymptotic behavior of constant stepsize temporal-difference algorithms is analyzed. The analysis is carried out for the case of a linear (cost-to-go) function approximation and for the case of Markov chains with an uncountable state space. An asymptotic upper bound for the mean-square deviation of the algorithm iterations from the optimal value of the parameter of the (cost-to-go) function approximator achievable by temporal-difference learning is determined as a function of stepsize.
Keywords :
"Algorithm design and analysis","Function approximation","Approximation algorithms","State-space methods","Upper bound","Convergence","Stochastic systems","Predictive models","Control system analysis","Time series analysis"
Publisher :
IEEE
Conference_Titel :
Proceedings of the 1999 American Control Conference
ISSN :
0743-1619
Print_ISBN :
0-7803-4990-3
Type :
conf
DOI :
10.1109/ACC.1999.786595
Filename :
786595