DocumentCode
2324482
Title
Two-time-scale online actor-critic paradigm driven by POMDP
Author
Liu, Bo ; He, Haibo ; Repperger, Daniel W.
Author_Institution
Dept. of Electr. & Comput. Eng., Stevens Inst. of Technol., Hoboken, NJ, USA
fYear
2010
fDate
10-12 April 2010
Firstpage
243
Lastpage
248
Abstract
In this paper, we analyze a class of actor-critic algorithms in a partially observable Markov decision process (POMDP) environment. Specifically, we focus on a two-time-scale framework in which the critic uses temporal-difference learning with a neural network (NN) as a nonlinear function approximator, and the actor is updated using a greedy algorithm with a stochastic gradient approach. Instead of constructing a hidden-state estimator, as is commonly done, we develop the idea originated by Singh, Jaakkola and Jordan (1994) into an online action-dependent actor-critic paradigm. This framework explores the ability of the adaptive dynamic programming (ADP) approach to operate in a POMDP environment without extra architectures such as state estimators. Both the theoretical analysis and the simulation studies show that the framework performs effectively under the assumptions given in this paper.
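The abstract's central idea, an action-dependent critic and an actor that both work directly on observations rather than on an estimated hidden state, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not taken from the paper: the four-state corridor with aliased observations, the tabular critic standing in for the neural-network critic, the softmax policy-gradient actor standing in for the paper's greedy actor update, and the names `step`, `observe`, and `train`. The two time scales are approximated with constant step sizes, a larger one for the critic and a smaller one for the actor; convergence analyses of two-time-scale schemes typically use decaying step sizes whose ratio tends to zero.

```python
import math
import random


def softmax(prefs):
    """Numerically stable softmax over a list of action preferences."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def step(state, action):
    """Hypothetical corridor with hidden states 0..3; state 3 is the goal."""
    nxt = min(state + 1, 3) if action == 1 else max(state - 1, 0)
    done = nxt == 3
    return nxt, (1.0 if done else 0.0), done


def observe(state):
    """Aliased observation: states {0, 1} look alike, as do {2, 3}."""
    return state // 2


def train(episodes=3000, gamma=0.9, alpha_critic=0.1, alpha_actor=0.01, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(2)]      # critic: Q(observation, action)
    theta = [[0.0, 0.0] for _ in range(2)]  # actor: preferences per observation

    def sample(obs):
        return 1 if rng.random() < softmax(theta[obs])[1] else 0

    for _ in range(episodes):
        state = 0
        obs = observe(state)
        action = sample(obs)
        for _ in range(50):  # cap on episode length
            state, reward, done = step(state, action)
            next_obs = observe(state)
            next_action = sample(next_obs)

            # Fast time scale: temporal-difference (SARSA-style) critic update.
            target = reward if done else reward + gamma * q[next_obs][next_action]
            q[obs][action] += alpha_critic * (target - q[obs][action])

            # Slow time scale: stochastic-gradient actor update driven by the critic.
            probs = softmax(theta[obs])
            baseline = sum(p * v for p, v in zip(probs, q[obs]))
            advantage = q[obs][action] - baseline
            for a in range(2):
                grad = (1.0 if a == action else 0.0) - probs[a]
                theta[obs][a] += alpha_actor * advantage * grad

            if done:
                break
            obs, action = next_obs, next_action
    return q, theta


if __name__ == "__main__":
    q, theta = train()
    for obs, prefs in enumerate(theta):
        print("observation", obs, "policy", softmax(prefs))
```

In this toy corridor moving right is the better action from every hidden state, so a memoryless policy over observations suffices; in general, a policy that ignores history can be suboptimal in a POMDP, which is why the abstract qualifies its claims with assumptions.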
Keywords
Markov processes; dynamic programming; function approximation; gradient methods; greedy algorithms; neural nets; POMDP environment; adaptive dynamic programming; greedy algorithm; neural network; nonlinear function approximator; online actor-critic paradigm; partially observable Markov decision process; stochastic gradient approach; temporal difference; two-time-scale framework; Algorithm design and analysis; Analytical models; Convergence; Dynamic programming; Greedy algorithms; Hidden Markov models; Neural networks; Robot sensing systems; State estimation; Stochastic processes;
fLanguage
English
Publisher
IEEE
Conference_Titel
Networking, Sensing and Control (ICNSC), 2010 International Conference on
Conference_Location
Chicago, IL
Print_ISBN
978-1-4244-6450-0
Type
conf
DOI
10.1109/ICNSC.2010.5461491
Filename
5461491