Regional Information Center for Science and Technology (RICEST) - Asymptotically efficient allocation rules for the multiarmed bandit problem with multiple plays-Part I: I.I.D. rewards

DocumentCode :

854438

Title :

Asymptotically efficient allocation rules for the multiarmed bandit problem with multiple plays-Part I: I.I.D. rewards

Author :

Anantharam, Venkatachalam ; Varaiya, Pravin ; Walrand, Jean

Author_Institution :

Cornell University, Ithaca, NY, USA

Volume :

Issue :

fYear :

1987

fDate :

11/1/1987

Firstpage :

968

Lastpage :

976

Abstract :

At each instant of time we are required to sample a fixed number $m \geq 1$ out of $N$ i.i.d. processes whose distributions belong to a family suitably parameterized by a real number $\theta$. The objective is to maximize the long run total expected value of the samples. Following Lai and Robbins, the learning loss of a sampling scheme corresponding to a configuration of parameters $C = (\theta_{1}, \ldots, \theta_{N})$ is quantified by the regret $R_{n}(C)$. This is the difference between the maximum expected reward at time $n$ that could be achieved if $C$ were known and the expected reward actually obtained by the sampling scheme. We provide a lower bound for the regret associated with any uniformly good scheme, and construct a scheme which attains the lower bound for every configuration $C$. The lower bound is given explicitly in terms of the Kullback-Leibler number between pairs of distributions. Part II of this paper considers the same problem when the reward processes are Markovian.
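
As an illustration of how a Kullback-Leibler-based lower bound of this kind can be evaluated, the sketch below computes the coefficient of $\log n$ for Bernoulli reward processes. It is not taken from the record. The Bernoulli family, the function names `kl_bernoulli` and `lower_bound_constant`, and the specific form of the sum (each arm outside the best $m$ contributes its gap to the $m$-th best mean divided by the Kullback-Leibler number between the two) are assumptions based on the commonly quoted form of this result; the means are assumed distinct and strictly between 0 and 1.

```python
import math


def kl_bernoulli(p, q):
    """Kullback-Leibler number I(p, q) between Bernoulli(p) and Bernoulli(q).

    Assumes 0 < p < 1 and 0 < q < 1 (illustrative choice of family).
    """
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))


def lower_bound_constant(means, m):
    """Coefficient of log n in a Lai-Robbins style regret lower bound
    when m of the arms are sampled at each instant.

    Hypothetical helper: assumes distinct Bernoulli means in (0, 1).
    """
    ranked = sorted(means, reverse=True)
    threshold = ranked[m - 1]  # mean of the m-th best arm
    # Sum over the arms that are not among the m best.
    return sum(
        (threshold - mu) / kl_bernoulli(mu, threshold)
        for mu in ranked[m:]
    )


if __name__ == "__main__":
    # Three arms, two sampled per instant: only the worst arm contributes.
    print(lower_bound_constant([0.9, 0.8, 0.5], m=2))
```

With $m = N$ every arm is sampled at each instant, so the sum is empty and the constant is zero, which matches the intuition that nothing has to be learned in that case.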

Keywords :

Adaptive control; Optimal stochastic control; Resource management; Stochastic optimal control; Arm; Computer aided manufacturing; Computer science; Density measurement; Laboratories; Manufacturing systems; Sampling methods; State-space methods; Statistics

fLanguage :

English

Journal_Title :

Automatic Control, IEEE Transactions on

Publisher :

IEEE

ISSN :

0018-9286

Type :

jour

DOI :

10.1109/TAC.1987.1104491

Filename :

1104491

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=854438