DocumentCode :
3133171
Title :
Policy iteration for parameterized Markov decision processes and its application
Author :
Li Xia ; Qing-Shan Jia
Author_Institution :
Center for Intelligent and Networked Systems (CFINS), Department of Automation, Tsinghua University, Beijing, China
fYear :
2013
fDate :
23-26 June 2013
Firstpage :
1
Lastpage :
6
Abstract :
In a parameterized Markov decision process (MDP), the decision maker must choose the parameters that induce the maximal average system reward. However, the traditional policy iteration algorithm is usually inapplicable because the choice of parameters is not independent of the system state. In this paper, we use the direct comparison approach to study this problem. A general difference equation is derived to compare the system performance under different parameters. We then derive a theoretical condition that guarantees the applicability of policy iteration to the parameterized MDP. The resulting policy-iteration-type algorithm is much more efficient than gradient-based optimization for parameterized MDPs. Finally, we study the service rate control problem of closed Jackson networks as an example to demonstrate the main idea of this paper.
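As background for the abstract above, the following is a minimal sketch of classical average-reward policy iteration on a toy finite MDP: policy evaluation solves the Poisson equation g + h(s) = r(s, a) + Σ P_a(s, s') h(s') with h(0) = 0, and improvement is greedy in the bias h. The example MDP, its transition matrices, and the use of per-state actions to stand in for parameter choices are all illustrative assumptions; this is not the authors' parameterized algorithm or their difference-equation condition.

```python
# Generic average-reward policy iteration sketch (illustrative only;
# not the parameterized-MDP algorithm of the paper).

def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def evaluate(policy, P, r):
    """Poisson equation: g + h(s) = r(s,a) + sum_s' P_a(s,s') h(s'), h(0)=0.
    Unknowns: [g, h(1), ..., h(n-1)]; returns (average reward g, bias h)."""
    n = len(policy)
    A, b = [], []
    for s in range(n):
        a = policy[s]
        A.append([1.0] + [(1.0 if sp == s else 0.0) - P[a][s][sp]
                          for sp in range(1, n)])
        b.append(r[s][a])
    x = solve(A, b)
    return x[0], [0.0] + x[1:]

def policy_iteration(P, r, n_states, n_actions):
    """Alternate evaluation and greedy improvement until the policy is stable."""
    policy = [0] * n_states
    while True:
        g, h = evaluate(policy, P, r)
        new = [max(range(n_actions),
                   key=lambda a: r[s][a] + sum(P[a][s][sp] * h[sp]
                                               for sp in range(n_states)))
               for s in range(n_states)]
        if new == policy:
            return policy, g
        policy = new

# Hypothetical 2-state, 2-action example; the two actions play the role of
# parameter choices (e.g. a low and a high service rate).
P = [[[0.9, 0.1], [0.5, 0.5]],   # transitions under action 0
     [[0.2, 0.8], [0.1, 0.9]]]   # transitions under action 1
r = [[1.0, 0.0],                 # r[s][a]: reward in state s under action a
     [0.0, 2.0]]
best, g = policy_iteration(P, r, 2, 2)  # converges to policy [1, 1], g = 16/9
```

Policy iteration converges here in a few sweeps, which is the efficiency advantage the abstract claims over gradient-based search; the paper's contribution is a condition under which this kind of iteration remains valid when the "action" is a state-dependent parameter.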
Keywords :
Markov processes; decision theory; iterative methods; closed Jackson networks; direct comparison approach; general difference equation; maximal average system reward; parameterized MDP; parameterized Markov decision processes; policy iteration type algorithm; service rate control problem; Difference equations; Markov processes; Mathematical model; Optimization; Servers; System performance; Vectors; Markov decision process; direct comparison; parameterized policy; policy iteration; service rate control;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
2013 9th Asian Control Conference (ASCC)
Conference_Location :
Istanbul
Print_ISBN :
978-1-4673-5767-8
Type :
conf
DOI :
10.1109/ASCC.2013.6606023
Filename :
6606023