DocumentCode
18856
Title
Dynamic Optimization and Learning for Renewal Systems
Author
Neely, Michael J.
Author_Institution
Electr. Eng. Dept., Univ. of Southern California, Los Angeles, CA, USA
Volume
58
Issue
1
fYear
2013
fDate
Jan. 2013
Firstpage
32
Lastpage
46
Abstract
This paper considers optimization of time averages in systems with variable length renewal frames. Applications include power-aware and profit-aware scheduling in wireless networks, peer-to-peer networks, and transportation systems. Every frame, a new policy is implemented that affects the frame size and that creates a vector of attributes. The policy can be a single decision in response to a random event observed on the frame, or a sequence of such decisions. The goal is to choose policies on each frame in order to maximize a time average of one attribute, subject to additional time average constraints on the others. Two algorithms are developed, both based on Lyapunov optimization concepts. The first makes decisions to minimize a “drift-plus-penalty” ratio over each frame. The second is similar but does not involve a ratio. For systems that make only a single decision on each frame, both algorithms are shown to learn efficient behavior without a-priori statistical knowledge. The framework is also applicable to large classes of constrained Markov decision problems. Such problems are reduced to finding an approximate solution to a simpler unconstrained stochastic shortest path problem on each frame. Approximations for the simpler problem may still suffer from a curse of dimensionality for systems with large state space. However, our results are compatible with any approximation method, and demonstrate an explicit tradeoff between performance and convergence time.
Keywords
Lyapunov methods; Markov processes; decision making; optimisation; stochastic systems; Lyapunov optimization concepts; constrained Markov decision problems; drift-plus-penalty ratio; dynamic optimization; optimization; peer-to-peer networks; power-aware scheduling; profit-aware scheduling; renewal systems; time average constraints; time averages; transportation systems; unconstrained stochastic shortest path problem; variable length renewal frames; wireless networks; Algorithm design and analysis; Approximation methods; Convergence; Heuristic algorithms; Optimization; Protocols; Vectors; Markov decision problems (MDPs); stochastic processes;
fLanguage
English
Journal_Title
Automatic Control, IEEE Transactions on
Publisher
ieee
ISSN
0018-9286
Type
jour
DOI
10.1109/TAC.2012.2204831
Filename
6218174
Link To Document