مرکز منطقه ای اطلاع رساني علوم و فناوري - Dynamic Optimization and Learning for Renewal Systems

DocumentCode :

18856

Title :

Dynamic Optimization and Learning for Renewal Systems

Author :

Neely, Michael J.

Author_Institution :

Electr. Eng. Dept., Univ. of Southern California, Los Angeles, CA, USA

Volume :

Issue :

fYear :

2013

fDate :

Jan. 2013

Firstpage :

Lastpage :

Abstract :

This paper considers optimization of time averages in systems with variable length renewal frames. Applications include power-aware and profit-aware scheduling in wireless networks, peer-to-peer networks, and transportation systems. Every frame, a new policy is implemented that affects the frame size and that creates a vector of attributes. The policy can be a single decision in response to a random event observed on the frame, or a sequence of such decisions. The goal is to choose policies on each frame in order to maximize a time average of one attribute, subject to additional time average constraints on the others. Two algorithms are developed, both based on Lyapunov optimization concepts. The first makes decisions to minimize a “drift-plus-penalty” ratio over each frame. The second is similar but does not involve a ratio. For systems that make only a single decision on each frame, both algorithms are shown to learn efficient behavior without a-priori statistical knowledge. The framework is also applicable to large classes of constrained Markov decision problems. Such problems are reduced to finding an approximate solution to a simpler unconstrained stochastic shortest path problem on each frame. Approximations for the simpler problem may still suffer from a curse of dimensionality for systems with large state space. However, our results are compatible with any approximation method, and demonstrate an explicit tradeoff between performance and convergence time.

Keywords :

Lyapunov methods; Markov processes; decision making; optimisation; stochastic systems; Lyapunov optimization concepts; constrained Markov decision problems; drift-plus-penalty ratio; dynamic optimization; optimization; peer-to-peer networks; power-aware scheduling; profit-aware scheduling; renewal systems; time average constraints; time averages; transportation systems; unconstrained stochastic shortest path problem; variable length renewal frames; wireless networks; Algorithm design and analysis; Approximation methods; Convergence; Heuristic algorithms; Optimization; Protocols; Vectors; Markov decision problems (MDPs); stochastic processes;

fLanguage :

English

Journal_Title :

Automatic Control, IEEE Transactions on

Publisher :

ieee

ISSN :

0018-9286

Type :

jour

DOI :

10.1109/TAC.2012.2204831

Filename :

6218174

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=18856