DocumentCode :
574601
Title :
A two-phase time aggregation algorithm for average cost Markov decision processes
Author :
Arruda, E.F. ; Fragoso, Marcelo D.
Author_Institution :
PEP, Fed. Univ. of Rio de Janeiro-UFRJ, Rio de Janeiro, Brazil
fYear :
2012
fDate :
27-29 June 2012
Firstpage :
1615
Lastpage :
1620
Abstract :
This paper introduces a two-phase approach to solve average cost Markov decision processes, which is based on state space embedding or time aggregation. In the first phase, time aggregation is applied for policy evaluation in a prescribed subset of the state space, and a novel result is applied to expand the evaluation to the whole state space. This evaluation is then used in the second phase in a policy improvement step, and the two phases are then sequentially applied until convergence is attained or a prescribed running time is exceeded.
Keywords :
Markov processes; state-space methods; average cost Markov decision process; policy evaluation; policy improvement step; state space embedding; two-phase time aggregation algorithm; Aerospace electronics; Function approximation; Hafnium; Markov processes; Poisson equations; Process control; Production; Dynamic Programming; Embedding; Markov Decision Processes; Stochastic Optimal Control; Time Aggregation;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
American Control Conference (ACC), 2012
Conference_Location :
Montreal, QC
ISSN :
0743-1619
Print_ISBN :
978-1-4577-1095-7
Electronic_ISBN :
0743-1619
Type :
conf
DOI :
10.1109/ACC.2012.6315187
Filename :
6315187