• DocumentCode
    18856
  • Title

    Dynamic Optimization and Learning for Renewal Systems

  • Author

    Neely, Michael J.

  • Author_Institution
    Electr. Eng. Dept., Univ. of Southern California, Los Angeles, CA, USA
  • Volume
    58
  • Issue
    1
  • fYear
    2013
  • fDate
    Jan. 2013
  • Firstpage
    32
  • Lastpage
    46
  • Abstract
    This paper considers optimization of time averages in systems with variable length renewal frames. Applications include power-aware and profit-aware scheduling in wireless networks, peer-to-peer networks, and transportation systems. Every frame, a new policy is implemented that affects the frame size and that creates a vector of attributes. The policy can be a single decision in response to a random event observed on the frame, or a sequence of such decisions. The goal is to choose policies on each frame in order to maximize a time average of one attribute, subject to additional time average constraints on the others. Two algorithms are developed, both based on Lyapunov optimization concepts. The first makes decisions to minimize a “drift-plus-penalty” ratio over each frame. The second is similar but does not involve a ratio. For systems that make only a single decision on each frame, both algorithms are shown to learn efficient behavior without a-priori statistical knowledge. The framework is also applicable to large classes of constrained Markov decision problems. Such problems are reduced to finding an approximate solution to a simpler unconstrained stochastic shortest path problem on each frame. Approximations for the simpler problem may still suffer from a curse of dimensionality for systems with large state space. However, our results are compatible with any approximation method, and demonstrate an explicit tradeoff between performance and convergence time.
  • Keywords
    Lyapunov methods; Markov processes; decision making; optimisation; stochastic systems; Lyapunov optimization concepts; constrained Markov decision problems; drift-plus-penalty ratio; dynamic optimization; optimization; peer-to-peer networks; power-aware scheduling; profit-aware scheduling; renewal systems; time average constraints; time averages; transportation systems; unconstrained stochastic shortest path problem; variable length renewal frames; wireless networks; Algorithm design and analysis; Approximation methods; Convergence; Heuristic algorithms; Optimization; Protocols; Vectors; Markov decision problems (MDPs); stochastic processes;
  • fLanguage
    English
  • Journal_Title
    Automatic Control, IEEE Transactions on
  • Publisher
    ieee
  • ISSN
    0018-9286
  • Type

    jour

  • DOI
    10.1109/TAC.2012.2204831
  • Filename
    6218174