DocumentCode :
1517943
Title :
Online Learning of Rested and Restless Bandits
Author :
Tekin, Cem ; Liu, Mingyan
Author_Institution :
Dept. of Electr. Eng. & Comput. Sci., Univ. of Michigan, Ann Arbor, MI, USA
Volume :
58
Issue :
8
fYear :
2012
Firstpage :
5588
Lastpage :
5611
Abstract :
In this paper, we study the online learning problem involving rested and restless bandits, in both a centralized and a decentralized setting. In a centralized setting, the system consists of a single player/user and a set of K finite-state discrete-time Markov chains (arms) with unknown state spaces (rewards) and statistics. The objective of the player is to decide in each step which M of the K arms to play over a sequence of trials so as to maximize its long-term reward. In a decentralized setting, multiple uncoordinated players each makes its own decision on which arm to play in a step, and if two or more players select the same arm simultaneously, a collision results and none of the players selecting that arm gets a reward. The objective of each player again is to maximize its long-term reward. We first show that logarithmic regret algorithms exist both for the centralized rested and restless bandit problems. For the decentralized setting, we propose an algorithm with logarithmic regret with respect to the optimal centralized arm allocation. Numerical results and extensive discussion are also provided to highlight insights obtained from this study.
Keywords :
Markov processes; discrete time systems; distance learning; K finite-state discrete-time Markov chains; online learning; optimal centralized arm allocation; rested bandits; restless bandits; Algorithm design and analysis; Context; Eigenvalues and eigenfunctions; Fading; Indexes; Markov processes; Silicon; Exploration–exploitation tradeoff; multiarmed bandits; online learning; opportunistic spectrum access (OSA); regret; restless bandits;
fLanguage :
English
Journal_Title :
Information Theory, IEEE Transactions on
Publisher :
ieee
ISSN :
0018-9448
Type :
jour
DOI :
10.1109/TIT.2012.2198613
Filename :
6200864
Link To Document :
بازگشت