  • DocumentCode
    574601
  • Title
    A two-phase time aggregation algorithm for average cost Markov decision processes
  • Author
    Arruda, E.F.; Fragoso, Marcelo D.

  • Author_Institution
    PEP, Fed. Univ. of Rio de Janeiro-UFRJ, Rio de Janeiro, Brazil
  • fYear
    2012
  • fDate
    27-29 June 2012
  • Firstpage
    1615
  • Lastpage
    1620
  • Abstract
    This paper introduces a two-phase approach for solving average cost Markov decision processes, based on state space embedding, or time aggregation. In the first phase, time aggregation is used to evaluate the policy on a prescribed subset of the state space, and a novel result extends this evaluation to the whole state space. The second phase uses the evaluation in a policy improvement step. The two phases are applied in sequence until convergence is attained or a prescribed running time is exceeded.
  • Keywords
    Markov processes; state-space methods; average cost Markov decision process; policy evaluation; policy improvement step; state space embedding; two-phase time aggregation algorithm; Aerospace electronics; Function approximation; Hafnium; Poisson equations; Process control; Production; Dynamic Programming; Embedding; Markov Decision Processes; Stochastic Optimal Control; Time Aggregation
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    American Control Conference (ACC), 2012
  • Conference_Location
    Montreal, QC
  • ISSN
    0743-1619
  • Print_ISBN
    978-1-4577-1095-7
  • Electronic_ISBN
    0743-1619
  • Type
    conf
  • DOI
    10.1109/ACC.2012.6315187
  • Filename
    6315187