A two-phase time aggregation algorithm for average cost Markov decision processes

Author

Arruda, E.F. ; Fragoso, Marcelo D.

Author_Institution

PEP, Fed. Univ. of Rio de Janeiro-UFRJ, Rio de Janeiro, Brazil

fYear

2012

fDate

27-29 June 2012

Firstpage

1615

Lastpage

1620

Abstract

This paper introduces a two-phase approach for solving average cost Markov decision processes, based on state space embedding or time aggregation. In the first phase, time aggregation is used to evaluate the current policy on a prescribed subset of the state space, and a novel result extends that evaluation to the whole state space. In the second phase, the evaluation is used in a policy improvement step. The two phases are applied in sequence until convergence is attained or a prescribed running time is exceeded.
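The abstract gives no formulas, so the following is only a minimal sketch of how an evaluate-then-improve loop of this kind might look, not the authors' algorithm. It assumes a finite MDP in which every stationary policy yields a unichain and the chosen subset is reachable from every other state. The evaluation on the subset uses the standard embedded-chain construction, and the extension to the remaining states simply partitions the Poisson equation, which may differ from the "novel result" the paper refers to. All names (`evaluate_by_embedding`, `two_phase_policy_iteration`, the toy `P` and `cost` arrays) are hypothetical.

```python
import numpy as np

def evaluate_by_embedding(P_pi, c_pi, subset):
    """Average cost eta and relative values g of one policy.

    Assumption: solves the Poisson equation g + eta = c + P g on the chain
    embedded in `subset`, then extends g to the remaining states.
    """
    n = len(c_pi)
    chosen = set(subset)
    A = np.array(sorted(chosen), dtype=int)
    B = np.array([s for s in range(n) if s not in chosen], dtype=int)
    # N[i, j]: expected visits to j in B before returning to A, starting at i in B.
    N = np.linalg.inv(np.eye(len(B)) - P_pi[np.ix_(B, B)])
    P_AB = P_pi[np.ix_(A, B)]
    P_BA = P_pi[np.ix_(B, A)]
    P_emb = P_pi[np.ix_(A, A)] + P_AB @ N @ P_BA   # embedded chain on A
    h_c = c_pi[A] + P_AB @ N @ c_pi[B]             # cost accumulated per segment
    h_1 = 1.0 + P_AB @ N @ np.ones(len(B))         # expected segment length
    # Embedded Poisson equation: (I - P_emb) g_A + eta * h_1 = h_c, g_A[0] = 0.
    m = len(A)
    M = np.zeros((m + 1, m + 1))
    M[:m, :m] = np.eye(m) - P_emb
    M[:m, m] = h_1
    M[m, 0] = 1.0
    sol = np.linalg.solve(M, np.append(h_c, 0.0))
    eta, g_A = sol[m], sol[:m]
    # Extension to the whole state space from the B rows of the Poisson equation.
    g = np.zeros(n)
    g[A] = g_A
    g[B] = N @ (c_pi[B] - eta + P_BA @ g_A)
    return eta, g

def two_phase_policy_iteration(P, cost, subset, max_iter=100):
    """P has shape (actions, states, states); cost has shape (states, actions)."""
    n_states = cost.shape[0]
    idx = np.arange(n_states)
    policy = np.zeros(n_states, dtype=int)
    for _ in range(max_iter):
        # Phase 1: evaluate the current policy.
        eta, g = evaluate_by_embedding(P[policy, idx, :], cost[idx, policy], subset)
        # Phase 2: improve; keep the current action unless another is strictly better.
        q = cost + np.einsum("asj,j->sa", P, g)
        better = q.min(axis=1) < q[idx, policy] - 1e-10
        new_policy = policy.copy()
        new_policy[better] = q.argmin(axis=1)[better]
        if np.array_equal(new_policy, policy):
            break
        policy = new_policy
    return policy, eta, g

# Toy instance (made-up numbers): 3 states, 2 actions, all transitions positive.
P = np.array([
    [[0.5, 0.3, 0.2], [0.2, 0.6, 0.2], [0.1, 0.3, 0.6]],
    [[0.1, 0.6, 0.3], [0.4, 0.1, 0.5], [0.3, 0.3, 0.4]],
])
cost = np.array([[2.0, 1.0], [1.0, 3.0], [4.0, 2.0]])
policy, eta, g = two_phase_policy_iteration(P, cost, subset=[0, 1])
print(policy, eta, g)
```

In this sketch the linear system solved in phase 1 has the size of the subset plus one, so a small subset keeps the evaluation cheap, at the price of inverting the block of transitions among the remaining states.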

Keywords

Markov processes; state-space methods; average cost Markov decision process; policy evaluation; policy improvement step; state space embedding; two-phase time aggregation algorithm; Aerospace electronics; Function approximation; Hafnium; Poisson equations; Process control; Production; Dynamic Programming; Embedding; Markov Decision Processes; Stochastic Optimal Control; Time Aggregation

fLanguage

English

Publisher

IEEE

Conference_Title

American Control Conference (ACC), 2012

Conference_Location

Montreal, QC

ISSN

0743-1619

Print_ISBN

978-1-4577-1095-7

Electronic_ISBN

0743-1619

Type

conf

DOI

10.1109/ACC.2012.6315187

Filename

6315187

Link To Document

https://search.isc.ac/dl/search/defaultta.aspx?DTC=49&DC=574601