Title :
A mean-square asymptotic analysis of a constant stepsize temporal-difference learning algorithm
Author_Institution :
Mihajlo Pupin Inst., Belgrade, Serbia
Abstract :
The mean-square asymptotic behavior of constant stepsize temporal-difference algorithms is analyzed. The analysis is carried out for linear (cost-to-go) function approximation and for Markov chains with an uncountable state space. An asymptotic upper bound on the mean-square deviation of the algorithm iterates from the optimal value of the parameter of the (cost-to-go) function approximator achievable by temporal-difference learning is derived as a function of the stepsize.
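The quantity studied in the abstract can be illustrated with a minimal sketch, which is not the paper's algorithm or setting. It assumes a TD(0) update, a two-state Markov chain (the paper treats uncountable state spaces), a scalar feature, and a discount factor. The chain, features, costs, and the function name td0_constant_step are all hypothetical choices made for illustration. The sketch runs the constant stepsize recursion and reports the time-averaged squared deviation of the iterate from the TD fixed point.

```python
import random


def td0_constant_step(step, n_steps=20000, seed=0):
    """Run TD(0) with a constant stepsize on a toy two-state chain.

    Returns (theta_star, mean_sq_dev), where theta_star is the TD(0)
    fixed point and mean_sq_dev is the squared deviation of the iterate
    from theta_star, averaged over the second half of the run.
    All numerical values below are illustrative assumptions.
    """
    rng = random.Random(seed)
    P = [[0.9, 0.1], [0.2, 0.8]]   # transition probabilities (assumed)
    phi = [1.0, 2.0]               # scalar feature per state (assumed)
    cost = [1.0, 0.0]              # one-step cost per state (assumed)
    gamma = 0.9                    # discount factor (assumed)
    pi = [2.0 / 3.0, 1.0 / 3.0]    # stationary distribution of P

    # TD(0) fixed point: solve A * theta = b for the scalar parameter.
    A = sum(
        pi[i] * phi[i] * (phi[i] - gamma * sum(P[i][j] * phi[j] for j in range(2)))
        for i in range(2)
    )
    b = sum(pi[i] * phi[i] * cost[i] for i in range(2))
    theta_star = b / A

    theta, x, sq_dev = 0.0, 0, 0.0
    burn_in = n_steps // 2
    for k in range(n_steps):
        # Sample the next state of the chain.
        x_next = 0 if rng.random() < P[x][0] else 1
        # Temporal-difference error for the linear approximator phi * theta.
        td_error = cost[x] + gamma * phi[x_next] * theta - phi[x] * theta
        # Constant stepsize update.
        theta += step * td_error * phi[x]
        if k >= burn_in:
            sq_dev += (theta - theta_star) ** 2
        x = x_next
    return theta_star, sq_dev / (n_steps - burn_in)


if __name__ == "__main__":
    for step in (0.05, 0.005):
        theta_star, dev = td0_constant_step(step)
        print(f"step={step}: theta*={theta_star:.4f}, mean sq. deviation={dev:.6f}")
```

With a constant stepsize the iterate does not settle at the fixed point but keeps fluctuating around it. In a toy run like this one, the averaged squared deviation would be expected to shrink as the stepsize decreases, which is the kind of stepsize dependence the abstract's bound describes.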
Keywords :
"Algorithm design and analysis","Function approximation","Approximation algorithms","State-space methods","Upper bound","Convergence","Stochastic systems","Predictive models","Control system analysis","Time series analysis"
Conference_Title :
American Control Conference, 1999. Proceedings of the 1999
Print_ISBN :
0-7803-4990-3
DOI :
10.1109/ACC.1999.786595