• DocumentCode
    3782285
  • Title
    A mean-square asymptotic analysis of a constant stepsize temporal-difference learning algorithm
  • Author
    V. Tadic
  • Author_Institution
    Mihajlo Pupin Inst., Belgrade, Serbia
  • Volume
    4
  • fYear
    1999
  • Firstpage
    2862
  • Abstract
    The mean-square asymptotic behavior of constant stepsize temporal-difference algorithms is analyzed. The analysis is carried out for the case of a linear (cost-to-go) function approximation and for the case of Markov chains with an uncountable state space. An asymptotic upper bound for the mean-square deviation of the algorithm iterations from the optimal value of the parameter of the (cost-to-go) function approximator achievable by temporal-difference learning is determined as a function of stepsize.
  • Keywords
    "Algorithm design and analysis","Function approximation","Approximation algorithms","State-space methods","Upper bound","Convergence","Stochastic systems","Predictive models","Control system analysis","Time series analysis"
  • Publisher
    IEEE
  • Conference_Titel
    Proceedings of the 1999 American Control Conference
  • ISSN
    0743-1619
  • Print_ISBN
    0-7803-4990-3
  • Type
    conf
  • DOI
    10.1109/ACC.1999.786595
  • Filename
    786595