• DocumentCode
    1805450
  • Title

    Infinite-horizon gradient estimation for semi-Markov decision processes

  • Author

    Li, Yanjie ; Cao, Fang

  • Author_Institution
    Shenzhen Grad. Sch., Harbin Inst. of Technol., Shenzhen, China
  • fYear
    2011
  • fDate
    15-18 May 2011
  • Firstpage
    926
  • Lastpage
    931
  • Abstract
    This paper presents a performance gradient formula for semi-Markov decision processes under the average-reward criterion. Based on this formula, we propose an infinite-horizon online (sample-path-based) gradient estimation algorithm. The algorithm naturally extends the online gradient estimation algorithm for discrete-time Markov systems to continuous-time semi-Markov models. In particular, the new algorithm requires less storage than the algorithm that has appeared in the literature.
  • Keywords
    Markov processes; continuous time systems; decision theory; discrete time systems; gradient methods; average reward criterion; continuous time semi-Markov models; discrete time Markov systems; infinite horizon online gradient estimation algorithm; semi-Markov decision processes; Algorithm design and analysis; Approximation algorithms; Approximation methods; Equations; Estimation; Optimization
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Control Conference (ASCC), 2011 8th Asian
  • Conference_Location
    Kaohsiung
  • Print_ISBN
    978-1-61284-487-9
  • Electronic_ISBN
    978-89-956056-4-6
  • Type

    conf

  • Filename
    5899196