• DocumentCode
    1341613
  • Title

    On Optimality of Myopic Policy for Restless Multi-Armed Bandit Problem: An Axiomatic Approach

  • Author

    Wang, Kehao ; Chen, Lin

  • Author_Institution
    Sch. of Inf. Eng., Wuhan Univ. of Technol., Wuhan, China
  • Volume
    60
  • Issue
    1
  • fYear
    2012
  • Firstpage
    300
  • Lastpage
    309
  • Abstract
    Due to its application in numerous engineering problems, the restless multi-armed bandit (RMAB) problem is of fundamental importance in stochastic decision theory. However, solving the RMAB problem is well known to be PSPACE-hard, with the optimal policy usually intractable due to the exponential computation complexity. A natural alternative approach is to seek simple myopic policies which are easy to implement. This paper presents a generic study on the optimality of the myopic policy for the RMAB problem. More specifically, we develop three axioms characterizing a family of generic and practically important functions termed as regular functions. By performing a mathematical analysis based on the developed axioms, we establish the closed-form conditions under which the myopic policy is guaranteed to be optimal. The axiomatic analysis also illuminates important engineering implications of the myopic policy including the intrinsic tradeoff between exploration and exploitation. A case study is then presented to illustrate the application of the derived results in analyzing a class of RMAB problems arising from multi-channel opportunistic access.
  • Keywords
    communication complexity; decision theory; mathematical analysis; stochastic processes; wireless channels; PSPACE-hard problem; RMAB problem; closed-form condition; exponential computation complexity; mathematical analysis; multichannel opportunistic access; myopic policy; restless multiarmed bandit problem; stochastic decision theory; Complexity theory; Context; Decision theory; Educational institutions; Focusing; Indexes; Markov processes; Myopic policy; opportunistic spectrum access (OSA); restless multi-armed bandit (RMAB) problem;
  • fLanguage
    English
  • Journal_Title
    Signal Processing, IEEE Transactions on
  • Publisher
    ieee
  • ISSN
    1053-587X
  • Type

    jour

  • DOI
    10.1109/TSP.2011.2170684
  • Filename
    6035799