On Optimality of Myopic Policy for Restless Multi-Armed Bandit Problem: An Axiomatic Approach

Author

Wang, Kehao ; Chen, Lin

Author_Institution

Sch. of Inf. Eng., Wuhan Univ. of Technol., Wuhan, China

Volume

60

Issue

1

fYear

2012

Firstpage

300

Lastpage

309

Abstract

Due to its application in numerous engineering problems, the restless multi-armed bandit (RMAB) problem is of fundamental importance in stochastic decision theory. However, solving the RMAB problem is well known to be PSPACE-hard, with the optimal policy usually intractable due to the exponential computation complexity. A natural alternative approach is to seek simple myopic policies which are easy to implement. This paper presents a generic study on the optimality of the myopic policy for the RMAB problem. More specifically, we develop three axioms characterizing a family of generic and practically important functions termed as regular functions. By performing a mathematical analysis based on the developed axioms, we establish the closed-form conditions under which the myopic policy is guaranteed to be optimal. The axiomatic analysis also illuminates important engineering implications of the myopic policy including the intrinsic tradeoff between exploration and exploitation. A case study is then presented to illustrate the application of the derived results in analyzing a class of RMAB problems arising from multi-channel opportunistic access.

Keywords

communication complexity; decision theory; mathematical analysis; stochastic processes; wireless channels; PSPACE-hard problem; RMAB problem; closed-form condition; exponential computation complexity; mathematical analysis; multichannel opportunistic access; myopic policy; restless multiarmed bandit problem; stochastic decision theory; Complexity theory; Context; Decision theory; Educational institutions; Focusing; Indexes; Markov processes; Myopic policy; opportunistic spectrum access (OSA); restless multi-armed bandit (RMAB) problem;

fLanguage

English

Journal_Title

Signal Processing, IEEE Transactions on

Publisher

ieee

ISSN

1053-587X

Type

jour

DOI

10.1109/TSP.2011.2170684

Filename

6035799