• DocumentCode
    1527542
  • Title
    Approximate Robust Policy Iteration Using Multilayer Perceptron Neural Networks for Discounted Infinite-Horizon Markov Decision Processes With Uncertain Correlated Transition Matrices
  • Author
    Li, Baohua; Si, Jennie
  • Author_Institution
    Arkansas Inst. for Nanomater. Sci. & Eng., Univ. of Arkansas, Fayetteville, AR, USA
  • Volume
    21
  • Issue
    8
  • fYear
    2010
  • Firstpage
    1270
  • Lastpage
    1280
  • Abstract
    We study finite-state, finite-action, discounted infinite-horizon Markov decision processes with uncertain correlated transition matrices in deterministic policy spaces. Existing robust dynamic programming methods cannot be extended to solve this general class of problems. In this paper, based on a robust optimality criterion, an approximate robust policy iteration using a multilayer perceptron neural network is proposed. It is proven that the proposed algorithm converges in a finite number of iterations, and that it converges to a stationary optimal or near-optimal policy in a probabilistic sense. In addition, we point out that sometimes even direct enumeration may not be applicable to this class of problems. However, direct enumeration based on our proposed maximum value approximation over the parameter space is a feasible approach. We provide further analysis to show that our proposed algorithm is more efficient than such an enumeration method in various scenarios.
  • Keywords
    Markov processes; dynamic programming; iterative methods; matrix algebra; multilayer perceptrons; Markov decision process; approximate robust policy iteration; deterministic policy spaces; direct enumeration method; discounted infinite-horizon; dynamic programming methods; multilayer perceptron neural networks; robust optimality criterion; uncertain correlated transition matrices; Algorithm design and analysis; Cost function; Dynamic programming; Equations; Estimation error; Multi-layer neural network; Multilayer perceptrons; Neural networks; Noise robustness; Uncertainty; Approximate dynamic programming; Markov decision processes (MDP); multilayer perceptrons; uncertain transition matrix; Algorithms; Animals; Artificial Intelligence; Decision Support Techniques; Decision Theory; Humans; Markov Chains; Mathematical Concepts; Neural Networks (Computer)
  • fLanguage
    English
  • Journal_Title
    Neural Networks, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1045-9227
  • Type
    jour
  • DOI
    10.1109/TNN.2010.2050334
  • Filename
    5499042