مرکز منطقه ای اطلاع رساني علوم و فناوري - Relay selection problem in wireless networks: A solution concept based on stochastic bandits and calibrated forecasters

DocumentCode :

3226865

Title :

Relay selection problem in wireless networks: A solution concept based on stochastic bandits and calibrated forecasters

Author :

Maghsudi, Setareh ; Stanczak, Slawomir

Author_Institution :

Tech. Univ. Berlin, Berlin, Germany

fYear :

2013

fDate :

16-19 June 2013

Firstpage :

385

Lastpage :

389

Abstract :

Multi-armed bandit (MAB) problems form a class of sequential optimization problems, in which a player sequentially pulls an arm, selected from a known and finite set of arms, in order to achieve an initially unknown reward. The player aims at maximizing the accumulated reward over the game horizon. In stochastic bandits with side information, it is assumed that some side information is revealed to the player at the beginning of each game trial, and that the instantaneous rewards of each arm are drawn according to a probability distribution attributed to that arm. In this paper, we study the relay selection problem in wireless networks, where users are provided with no channel knowledge. Under the assumption of orthogonal transmission, and from the view point of each user, we show that the selection problem boils down to a stochastic MAB game, with the side information being the predicted joint action profile of other users. We propose a selection strategy to solve this game, which consists of two parallel routines. In the first routine, each player employs a calibrated forecaster to predict the joint action profile of other players, i.e. to obtain the side information. In the second routine, the player balances the exploration-exploitation tradeoff, i.e. estimates the reward generating process of arms while trying to attain the maximum achievable accumulated reward. Under reasonable assumptions, the proposed strategy is strongly consistent, in the sense that the accumulated reward of each individual player is not much less than that of the best arm, asymptotically almost surely. Moreover, upon using this strategy, the selection game converges to the set of correlated equilibria.

Keywords :

game theory; relay networks (telecommunication); wireless channels; MAB problems; calibrated forecasters; channel knowledge; game horizon; multiarmed bandit; orthogonal transmission; probability distribution; relay selection problem; sequential optimization problems; stochastic MAB game; stochastic bandits; wireless networks; Games; Indexes; Joints; Probability distribution; Relays; Signal processing algorithms; Throughput;

fLanguage :

English

Publisher :

ieee

Conference_Titel :

Signal Processing Advances in Wireless Communications (SPAWC), 2013 IEEE 14th Workshop on

Conference_Location :

Darmstadt

ISSN :

1948-3244

Type :

conf

DOI :

10.1109/SPAWC.2013.6612077

Filename :

6612077

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3226865