DocumentCode
574601
Title
A two-phase time aggregation algorithm for average cost Markov decision processes
Author
Arruda, E.F. ; Fragoso, Marcelo D.
Author_Institution
PEP, Fed. Univ. of Rio de Janeiro-UFRJ, Rio de Janeiro, Brazil
fYear
2012
fDate
27-29 June 2012
Firstpage
1615
Lastpage
1620
Abstract
This paper introduces a two-phase approach for solving average cost Markov decision processes, based on state space embedding, or time aggregation. In the first phase, time aggregation is used to evaluate the policy on a prescribed subset of the state space, and a novel result extends that evaluation to the whole state space. The second phase uses this evaluation in a policy improvement step. The two phases are applied in sequence until convergence is attained or a prescribed running time is exceeded.
Keywords
Markov processes; state-space methods; average cost Markov decision process; policy evaluation; policy improvement step; state space embedding; two-phase time aggregation algorithm; Aerospace electronics; Function approximation; Hafnium; Poisson equations; Process control; Production; Dynamic Programming; Embedding; Markov Decision Processes; Stochastic Optimal Control; Time Aggregation
fLanguage
English
Publisher
IEEE
Conference_Titel
American Control Conference (ACC), 2012
Conference_Location
Montreal, QC
ISSN
0743-1619
Print_ISBN
978-1-4577-1095-7
Electronic_ISBN
0743-1619
Type
conf
DOI
10.1109/ACC.2012.6315187
Filename
6315187