• DocumentCode
    2324482
  • Title
    Two-time-scale online actor-critic paradigm driven by POMDP
  • Author
    Liu, Bo ; He, Haibo ; Repperger, Daniel W.
  • Author_Institution
    Dept. of Electr. & Comput. Eng., Stevens Inst. of Technol., Hoboken, NJ, USA
  • fYear
    2010
  • fDate
    10-12 April 2010
  • Firstpage
    243
  • Lastpage
    248
  • Abstract
    In this paper, we analyze a class of actor-critic algorithms in a partially observable Markov decision process (POMDP) environment. Specifically, we focus on the two-time-scale framework in which the critic uses temporal difference learning with a neural network (NN) as a nonlinear function approximator, and the actor is updated using a greedy algorithm with the stochastic gradient approach. Instead of the common construction of a hidden state estimator, we develop the idea originating from Singh, Jaakkola and Jordan (1994) into an online action-dependent actor-critic paradigm. This framework explores the ability of the adaptive dynamic programming (ADP) approach in a POMDP environment without implementing extra architectures such as state estimators. Both the theoretical analysis and simulation studies validate that the framework performs effectively under certain assumptions given in this paper.
  • Keywords
    Markov processes; dynamic programming; function approximation; gradient methods; greedy algorithms; neural nets; POMDP environment; adaptive dynamic programming; greedy algorithm; neural network; nonlinear function approximator; online actor-critic paradigm; partially observable Markov decision process; stochastic gradient approach; temporal difference; two-time-scale framework; Algorithm design and analysis; Analytical models; Convergence; Dynamic programming; Greedy algorithms; Hidden Markov models; Neural networks; Robot sensing systems; State estimation; Stochastic processes;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Networking, Sensing and Control (ICNSC), 2010 International Conference on
  • Conference_Location
    Chicago, IL
  • Print_ISBN
    978-1-4244-6450-0
  • Type
    conf
  • DOI
    10.1109/ICNSC.2010.5461491
  • Filename
    5461491
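
The two-time-scale idea summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it replaces the neural-network critic with a lookup table, replaces the greedy actor with a softmax policy updated by a score-function gradient, and works directly on observations, with no hidden state estimator. The function names, the step-size exponents, and the discount factor are all assumptions chosen for illustration. The only property carried over from the abstract is that the action-dependent critic learns on a faster time scale than the actor.

```python
import math


def step_sizes(t):
    """Hypothetical schedules: the critic step decays more slowly than the actor step.

    The ratio actor_lr / critic_lr = (t + 1) ** -0.4 tends to 0, so the critic
    runs on the faster time scale.
    """
    critic_lr = 1.0 / (t + 1) ** 0.6
    actor_lr = 1.0 / (t + 1)
    return critic_lr, actor_lr


def softmax(prefs):
    """Numerically stable softmax over a list of action preferences."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def update(q, prefs, obs, action, reward, next_obs, next_action, t, gamma=0.9):
    """One online update on observations (assumed tabular simplification).

    q[obs][action]     : action-dependent critic estimate (fast time scale)
    prefs[obs][action] : softmax policy preferences (slow time scale)
    """
    critic_lr, actor_lr = step_sizes(t)

    # Critic: temporal-difference update of the action-dependent value.
    td_error = reward + gamma * q[next_obs][next_action] - q[obs][action]
    q[obs][action] += critic_lr * td_error

    # Actor: stochastic gradient step on the preferences, weighted by the critic.
    probs = softmax(prefs[obs])
    for a in range(len(probs)):
        grad_log_pi = (1.0 if a == action else 0.0) - probs[a]
        prefs[obs][a] += actor_lr * q[obs][action] * grad_log_pi

    return td_error


# Usage: two observations, two actions, everything initialized to zero.
q = [[0.0, 0.0], [0.0, 0.0]]
prefs = [[0.0, 0.0], [0.0, 0.0]]
td = update(q, prefs, obs=0, action=1, reward=1.0, next_obs=1, next_action=0, t=0)
```

Because the update conditions only on the current observation, its behaviour in a POMDP depends on assumptions like those the abstract refers to; the sketch makes no convergence claim of its own.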