Title :
Learning and sharing in a changing world: Non-Bayesian restless bandit with multiple players
Author :
Liu, Haoyang ; Liu, Keqin ; Zhao, Qing
Author_Institution :
Department of Electrical and Computer Engineering, University of California, Davis, CA, USA
Abstract :
We consider decentralized restless multi-armed bandit problems with unknown dynamics and multiple players. The reward state of each arm transits according to an unknown Markovian rule when it is played and evolves according to an arbitrary unknown random process when it is passive. Players activating the same arm at the same time collide and suffer from reward loss. The objective is to maximize the long-term reward by designing a decentralized arm selection policy to address unknown reward models and collisions among players. A decentralized policy is constructed that achieves a regret with logarithmic order. The result finds applications in communication networks, financial investment, and industrial engineering.
Keywords :
Markov processes; game theory; Markovian rule; communication networks; decentralized arm selection policy; financial investment; industrial engineering; multi-armed bandit problems; multiple players; non-Bayesian restless bandit; reward models; reward state; History; Indexes; Loss measurement; Random processes; Synchronization; Upper bound
Conference_Title :
Information Theory and Applications Workshop (ITA), 2011
Conference_Location :
La Jolla, CA
Print_ISBN :
978-1-4577-0360-7
DOI :
10.1109/ITA.2011.5743588