DocumentCode :
2850317
Title :
Learning and sharing in a changing world: Non-Bayesian restless bandit with multiple players
Author :
Liu, Haoyang ; Liu, Keqin ; Zhao, Qing
Author_Institution :
Dept. of Electr. & Comput. Eng., Univ. of California, Davis, CA, USA
fYear :
2011
fDate :
6-11 Feb. 2011
Firstpage :
1
Lastpage :
7
Abstract :
We consider decentralized restless multi-armed bandit problems with unknown dynamics and multiple players. The reward state of each arm evolves according to an unknown Markovian rule when the arm is played and according to an arbitrary unknown random process when it is passive. Players who activate the same arm at the same time collide and suffer a reward loss. The objective is to maximize the long-term reward by designing a decentralized arm selection policy that handles both the unknown reward models and collisions among players. A decentralized policy is constructed that achieves regret of logarithmic order. The result finds applications in communication networks, financial investment, and industrial engineering.
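Illustrative sketch (not from the paper): the abstract describes the key ingredients of the setting — each player learns arm statistics from its own observations, and collisions between players who activate the same arm must be avoided. The minimal Python simulation below assumes a simple UCB-style index per player and pre-assigned player ranks for orthogonalization; all names, parameters, and the reward model are hypothetical stand-ins, and the policy shown is not the one constructed in the paper.

import numpy as np

class Player:
    """One decentralized player; learns only from its own observations."""

    def __init__(self, rank, num_arms):
        self.rank = rank                      # pre-assigned offset used to avoid collisions (assumption)
        self.counts = np.zeros(num_arms)      # number of times each arm was played by this player
        self.means = np.zeros(num_arms)       # sample-mean reward estimates

    def select_arm(self, t):
        # UCB-style index computed from this player's own history only.
        bonus = np.sqrt(2.0 * np.log(t + 1) / np.maximum(self.counts, 1))
        index = self.means + np.where(self.counts > 0, bonus, np.inf)
        # Each player targets the arm ranked `rank`-th by its index,
        # so players with distinct ranks tend to settle on distinct arms.
        return np.argsort(-index)[self.rank]

    def update(self, arm, reward):
        self.counts[arm] += 1
        self.means[arm] += (reward - self.means[arm]) / self.counts[arm]


def simulate(num_arms=5, num_players=2, horizon=10_000, seed=0):
    rng = np.random.default_rng(seed)
    # Stand-in i.i.d. reward model; the paper's setting has unknown Markovian dynamics.
    true_means = rng.uniform(0.1, 0.9, size=num_arms)
    players = [Player(rank=k, num_arms=num_arms) for k in range(num_players)]
    total_reward = 0.0
    for t in range(horizon):
        choices = [p.select_arm(t) for p in players]
        for p, arm in zip(players, choices):
            collided = choices.count(arm) > 1
            reward = 0.0 if collided else float(rng.binomial(1, true_means[arm]))
            p.update(arm, reward)             # colliding players observe zero reward
            total_reward += reward
    return total_reward


if __name__ == "__main__":
    print("total reward:", simulate())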
Keywords :
Markov processes; game theory; Markovian rule; communication networks; decentralized arm selection policy; financial investment; industrial engineering; multi-armed bandit problems; multiple players; non-Bayesian restless bandit; reward models; reward state; History; Indexes; Loss measurement; Markov processes; Random processes; Synchronization; Upper bound
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Information Theory and Applications Workshop (ITA), 2011
Conference_Location :
La Jolla, CA
Print_ISBN :
978-1-4577-0360-7
Type :
conf
DOI :
10.1109/ITA.2011.5743588
Filename :
5743588