Markov Decision Processes with Threshold Based Piecewise Linear Optimal Policies

Author

Erseghe, Tomaso ; Zanella, A. ; Codemo, Claudio G.

Author_Institution

Dipt. di Ing. dell´Inf., Univ. di Padova, Padua, Italy

Volume

2

Issue

4

fYear

2013

fDate

Aug-13

Firstpage

459

Lastpage

462

Abstract

This letter investigates the structure of the optimal policy for a class of Markov decision processes (MDPs), having convex and piecewise linear cost function. The optimal policy is proved to have a piecewise linear structure that alternates flat and constant-slope pieces, resembling a staircase with tilted rises and as many steps (thresholds) as the breakpoints of the cost function. This particular structure makes it possible to express the policy in a very compact manner, particularly suitable to be stored in low-end devices. More importantly, the threshold-based form of the optimal policy can be exploited to reduce the computational complexity of the iterative dynamic programming (DP) algorithm used to solve the problem. These results apply to a rather wide set of optimization problems, typically involving the management of a resource buffer such as the energy stored in a battery, or the packets queued in a wireless node.

Keywords

Markov processes; MDP; Markov decision processes; computational complexity; constant slope pieces; iterative dynamic programming algorithm; optimal policy; piecewise linear cost function; piecewise linear structure; resource buffer; threshold based form; threshold based piecewise linear optimal policies; wireless node; Dynamic programming; Markov decision processes; low complexity; optimization; piecewise linear policy;

fLanguage

English

Journal_Title

Wireless Communications Letters, IEEE

Publisher

ieee

ISSN

2162-2337

Type

jour

DOI

10.1109/WCL.2013.052813.130213

Filename

6523044