Regional Information Center for Science and Technology (RICeST) - A two-phase time aggregation algorithm for average cost Markov decision processes

DocumentCode :

574601

Title :

A two-phase time aggregation algorithm for average cost Markov decision processes

Author :

Arruda, E.F. ; Fragoso, Marcelo D.

Author_Institution :

PEP, Fed. Univ. of Rio de Janeiro-UFRJ, Rio de Janeiro, Brazil

fYear :

2012

fDate :

27-29 June 2012

Firstpage :

1615

Lastpage :

1620

Abstract :

This paper introduces a two-phase approach for solving average cost Markov decision processes, based on state space embedding, or time aggregation. In the first phase, time aggregation is used for policy evaluation on a prescribed subset of the state space, and a novel result extends the evaluation to the whole state space. The second phase uses this evaluation in a policy improvement step. The two phases are applied in sequence until convergence is attained or a prescribed running time is exceeded.
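
The alternation of policy evaluation and policy improvement described in the abstract can be illustrated with a minimal sketch of standard average-cost policy iteration. This is not the authors' algorithm: the time-aggregation evaluation on a subset of states and the result that expands it to the whole state space are replaced here by a direct solve of the Poisson equation over all states, and a unichain model is assumed. The toy two-state MDP and every name (`solve_linear`, `evaluate`, `improve`, `policy_iteration`, `max_iters`) are hypothetical; `max_iters` stands in for the prescribed running time.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= f * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def evaluate(P, c):
    """Policy evaluation: solve eta + h(i) - sum_j P[i][j] h(j) = c(i), with h(0) = 0.

    Assumption: the chain under the policy is unichain, so the system is nonsingular.
    Unknowns are ordered as [eta, h(1), ..., h(n-1)].
    """
    n = len(c)
    a = [[1.0] + [(1.0 if i == j else 0.0) - P[i][j] for j in range(1, n)]
         for i in range(n)]
    x = solve_linear(a, c)
    return x[0], [0.0] + x[1:]


def improve(policy, cost, trans, h):
    """Policy improvement: greedy action per state, keeping the current action on ties."""
    new_policy = []
    for s, current in enumerate(policy):
        def q(a):
            return cost[s][a] + sum(p * hj for p, hj in zip(trans[s][a], h))
        best = current
        for a in range(len(cost[s])):
            if q(a) < q(best) - 1e-12:
                best = a
        new_policy.append(best)
    return new_policy


def policy_iteration(cost, trans, policy, max_iters=100):
    """Alternate evaluation and improvement until the policy is stable or the budget ends."""
    def rows(pol):
        return ([trans[s][a] for s, a in enumerate(pol)],
                [cost[s][a] for s, a in enumerate(pol)])
    eta, h = evaluate(*rows(policy))
    for _ in range(max_iters):
        new_policy = improve(policy, cost, trans, h)
        if new_policy == policy:
            break
        policy = new_policy
        eta, h = evaluate(*rows(policy))
    return policy, eta


# Hypothetical toy MDP: cost[s][a] and trans[s][a] (a row of transition probabilities).
cost = [[1.0, 3.0], [4.0, 2.0]]
trans = [[[0.5, 0.5], [0.1, 0.9]],
         [[0.5, 0.5], [0.9, 0.1]]]

policy, eta = policy_iteration(cost, trans, [1, 0])
print(policy, round(eta, 4))  # → [0, 1] 1.3571
```

In this sketch the evaluation step costs one linear solve over the full state space per iteration. The abstract's first phase instead evaluates the policy on a prescribed subset of states, so the two are not interchangeable.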

Keywords :

Markov processes; state-space methods; average cost Markov decision process; policy evaluation; policy improvement step; state space embedding; two-phase time aggregation algorithm; Aerospace electronics; Function approximation; Hafnium; Poisson equations; Process control; Production; Dynamic Programming; Embedding; Markov Decision Processes; Stochastic Optimal Control; Time Aggregation

fLanguage :

English

Publisher :

IEEE

Conference_Title :

American Control Conference (ACC), 2012

Conference_Location :

Montreal, QC

ISSN :

0743-1619

Print_ISBN :

978-1-4577-1095-7

Electronic_ISBN :

0743-1619

Type :

conf

DOI :

10.1109/ACC.2012.6315187

Filename :

6315187

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=574601