  • DocumentCode
    55438
  • Title
    Efficient Algorithms for Budget-Constrained Markov Decision Processes
  • Author
    Caramanis, Constantine ; Dimitrov, Nedialko B. ; Morton, David P.
  • Author_Institution
    Dept. of Electr. & Comput. Eng., Univ. of Texas at Austin, Austin, TX, USA
  • Volume
    59
  • Issue
    10
  • fYear
    2014
  • fDate
    Oct. 2014
  • Firstpage
    2813
  • Lastpage
    2817
  • Abstract
    Discounted, discrete-time, discrete state-space, discrete action-space Markov decision processes (MDPs) form a classical topic in control, game theory, and learning, and as a result are widely used, increasingly in very large-scale applications. Many algorithms have been developed to solve large-scale MDPs. Algorithms based on value iteration are particularly popular, as they are more efficient than the generic linear programming approach by an order of magnitude in the number of states of the MDP. Yet in the case of budget-constrained MDPs, no algorithm more efficient than linear programming is known. Practically, the slower running time of linear programming may limit the scalability of constrained MDPs; theoretically, it raises the question of whether the increase is somehow intrinsic. In this technical note we show that it is not, and provide two algorithms for budget-constrained MDPs that are as efficient as value iteration. Denoting the running time of value iteration by VI and the magnitude of the input by U, for an MDP with m expected budget constraints our first algorithm runs in time O(poly(m, log U) · VI). Given a pre-specified degree of precision, η, for satisfying the budget constraints, our second algorithm runs in time O(log m · poly(log U) · (1/η²) · VI), but may produce solutions that overutilize each of the m budgets by a multiplicative factor of 1 + η. In fact, one can substitute value iteration with any algorithm, possibly specially designed for a specific MDP, that solves the MDP quickly, and achieve similar theoretical guarantees. Both algorithms restrict attention to constrained infinite-horizon MDPs under discounted costs.
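    For context, the following is a minimal sketch of standard value iteration for an unconstrained, discounted, finite MDP, i.e., the baseline subroutine whose running time the abstract denotes by VI. It is not the note's budget-constrained method, and the array layout, discount factor, and tolerance are illustrative assumptions.

# Minimal sketch of textbook value iteration for an unconstrained, discounted,
# finite MDP (the subroutine the abstract calls VI). Not the note's
# budget-constrained algorithm; shapes, names, and tolerance are illustrative.
import numpy as np

def value_iteration(P, c, gamma=0.95, tol=1e-8):
    """P[a, s, t]: probability of moving from state s to state t under action a.
       c[s, a]:    immediate cost of taking action a in state s.
       gamma:      discount factor in (0, 1)."""
    n_actions, n_states, _ = P.shape
    V = np.zeros(n_states)
    while True:
        # Bellman backup: Q[s, a] = c[s, a] + gamma * sum_t P[a, s, t] * V[t]
        Q = c + gamma * np.einsum('ast,t->sa', P, V)
        V_new = Q.min(axis=1)                 # act greedily to minimize discounted cost
        if np.max(np.abs(V_new - V)) < tol:
            return V_new, Q.argmin(axis=1)    # converged values and a greedy policy
        V = V_new

# Tiny illustrative instance: 3 states, 2 actions, random dynamics and costs.
rng = np.random.default_rng(0)
P = rng.random((2, 3, 3))
P /= P.sum(axis=2, keepdims=True)             # normalize rows into distributions
c = rng.random((3, 2))
V, policy = value_iteration(P, c)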
  • Keywords
    Markov processes; computational complexity; decision theory; linear programming; budget-constrained MDP; budget-constrained Markov decision processes; discrete action-space; discrete state-space; game theory; generic linear programming; large-scale applications; learning; value iteration; Algorithm design and analysis; Approximation algorithms; Educational institutions; Ellipsoids; Linear programming; Markov processes; Vectors; Markov decision processes (MDPs);
  • fLanguage
    English
  • Journal_Title
    IEEE Transactions on Automatic Control
  • Publisher
    IEEE
  • ISSN
    0018-9286
  • Type
    jour
  • DOI
    10.1109/TAC.2014.2314211
  • Filename
    6780601