DocumentCode :
2324482
Title :
Two-time-scale online actor-critic paradigm driven by POMDP
Author :
Liu, Bo ; He, Haibo ; Repperger, Daniel W.
Author_Institution :
Dept. of Electr. & Comput. Eng., Stevens Inst. of Technol., Hoboken, NJ, USA
fYear :
2010
fDate :
10-12 April 2010
Firstpage :
243
Lastpage :
248
Abstract :
In this paper, we analyze a class of actor-critic algorithms in a partially observable Markov decision process (POMDP) environment. Specifically, we focus on the two-time-scale framework in which the critic uses temporal difference learning with a neural network (NN) as a nonlinear function approximator, and the actor is updated using a greedy algorithm with the stochastic gradient approach. Instead of the common approach of constructing a hidden-state estimator, we develop the idea originating from Singh, Jaakkola and Jordan (1994) into an online action-dependent actor-critic paradigm. This framework explores the ability of the adaptive dynamic programming (ADP) approach in a POMDP environment without implementing extra architectures such as state estimators. Both the theoretical analysis and the simulation studies validate that the framework performs effectively under certain assumptions given in this paper.
Keywords :
Markov processes; dynamic programming; function approximation; gradient methods; greedy algorithms; neural nets; POMDP environment; adaptive dynamic programming; greedy algorithm; neural network; nonlinear function approximator; online actor-critic paradigm; partially observable Markov decision process; stochastic gradient approach; temporal difference; two-time-scale framework; Algorithm design and analysis; Analytical models; Convergence; Dynamic programming; Greedy algorithms; Hidden Markov models; Neural networks; Robot sensing systems; State estimation; Stochastic processes;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Networking, Sensing and Control (ICNSC), 2010 International Conference on
Conference_Location :
Chicago, IL
Print_ISBN :
978-1-4244-6450-0
Type :
conf
DOI :
10.1109/ICNSC.2010.5461491
Filename :
5461491