DocumentCode :
2483152
Title :
Natural Gradient Policy for Average Cost SMDP Problem
Author :
Vien, Ngo Anh ; Chung, TaeChoong
Author_Institution :
Kyung Hee Univ., Seoul
Volume :
1
fYear :
2007
fDate :
29-31 Oct. 2007
Firstpage :
11
Lastpage :
18
Abstract :
Semi-Markov decision processes (SMDPs) are continuous-time generalizations of discrete-time Markov decision processes. A number of value- and policy-iteration algorithms have been developed for solving SMDPs, but these require prior knowledge of the deterministic kernels and suffer from the curse of dimensionality. In this paper, we present a steepest-descent direction based on a family of parameterized policies to overcome those limitations. The update rule is based on stochastic policy gradients employing Amari's natural gradient approach, which moves toward choosing a greedy optimal action. We then show considerable performance improvements of this method on a simple two-state SMDP and on the more complex SMDP of a call admission control problem.
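Example :
The abstract's update rule combines a stochastic policy gradient with Amari's natural gradient, i.e. the vanilla gradient preconditioned by the inverse Fisher information matrix. The sketch below is an illustrative, generic natural policy-gradient step, not the paper's exact algorithm; the sample-based Fisher estimate, the regularization term, and all variable names are assumptions for illustration.

```python
import numpy as np

def natural_gradient_step(theta, score_vectors, advantages, lr=0.1, reg=1e-3):
    """One generic natural policy-gradient update (illustrative sketch).

    theta:          policy parameter vector, shape (n,)
    score_vectors:  per-sample scores grad_theta log pi(a|s), shape (m, n)
    advantages:     per-sample advantage (or cost-differential) estimates, shape (m,)
    """
    m = len(advantages)
    # Vanilla stochastic policy gradient: average of score * advantage.
    g = score_vectors.T @ advantages / m
    # Empirical Fisher information matrix; a small ridge term keeps it invertible.
    F = score_vectors.T @ score_vectors / m + reg * np.eye(len(theta))
    # Natural gradient = F^{-1} g: steepest ascent under the Fisher metric.
    return theta + lr * np.linalg.solve(F, g)
```

Preconditioning by the Fisher matrix makes the step invariant to how the policy is parameterized, which is the motivation for natural (rather than vanilla) gradients in the paper.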
Keywords :
Markov processes; gradient methods; call admission control problem; discrete-time Markov decision process; natural gradient policy; policy iteration algorithms; semi-Markov decision processes; steepest descent direction; stochastic policy gradients; Artificial intelligence; Call admission control; Costs; Distributed computing; Dynamic programming; Function approximation; Gradient methods; Kernel; Laboratories; Stochastic processes;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Tools with Artificial Intelligence, 2007. ICTAI 2007. 19th IEEE International Conference on
Conference_Location :
Patras
ISSN :
1082-3409
Print_ISBN :
978-0-7695-3015-4
Type :
conf
DOI :
10.1109/ICTAI.2007.12
Filename :
4410255