• DocumentCode
    2297122
  • Title

    An exemplar test problem on parameter convergence analysis of temporal difference algorithms

  • Author

    Brown, Martin ; Tutsoy, Onder

  • Author_Institution
    Control Syst. Group, Univ. of Manchester, Manchester, UK
  • fYear
    2012
  • fDate
    6-8 July 2012
  • Firstpage
    2925
  • Lastpage
    2930
  • Abstract
    Reinforcement learning techniques have been developed to solve difficult learning control problems for which only a small amount of a priori knowledge about the system dynamics is available. In this paper, a simple unstable exemplar test problem is proposed to investigate issues in the parametric convergence of the value function. A specific closed-form solution for the value function, which has a polynomial form, is determined. It is proved that the temporal difference error introduces a null space associated with the finite horizon basis function during the control trajectory. The learning problem can be nonsingular only if the termination is handled correctly, and a number of possible solutions are introduced. This result was revealed only because of the derived closed-form solution for the value function.
  • Keywords
    convergence; infinite horizon; learning (artificial intelligence); learning systems; polynomials; a priori knowledge; closed-form solution; control trajectory; finite horizon basis function; learning control problems; learning problem; null space; parameter convergence analysis; parametric convergence; polynomial form; reinforcement learning techniques; system dynamics; temporal difference algorithms; temporal difference error; unstable exemplar test problem; value function; Algorithm design and analysis; Closed-form solutions; Convergence; Null space; Polynomials; Trajectory; Vectors; Reinforcement learning; polynomial basis functions; rate of convergence; temporal difference learning; value function approximation;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Intelligent Control and Automation (WCICA), 2012 10th World Congress on
  • Conference_Location
    Beijing
  • Print_ISBN
    978-1-4673-1397-1
  • Type

    conf

  • DOI
    10.1109/WCICA.2012.6358370
  • Filename
    6358370