مرکز منطقه ای اطلاع رساني علوم و فناوري - Regret and Convergence Bounds for a Class of Continuum-Armed Bandit Problems

DocumentCode :

918500

Title :

Regret and Convergence Bounds for a Class of Continuum-Armed Bandit Problems

Author :

Cope, Eric W.

Author_Institution :

IBM Zurich Res. Lab., Ruschlikon

Volume :

Issue :

fYear :

2009

fDate :

6/1/2009 12:00:00 AM

Firstpage :

1243

Lastpage :

1253

Abstract :

We consider a class of multi-armed bandit problems where the set of available actions can be mapped to a convex, compact region of Ropf^d, sometimes denoted as the ldquocontinuum-armed banditrdquo problem. The paper establishes bounds on the efficiency of any arm-selection procedure under certain conditions on the class of possible underlying reward functions. Both finite-time lower bounds on the growth rate of the regret, as well as asymptotic upper bounds on the rates of convergence of the selected control values to the optimum are derived. We explicitly characterize the dependence of these convergence rates on the minimal rate of variation of the mean reward function in a neighborhood of the optimal control. The bounds can be used to demonstrate the asymptotic optimality of the Kiefer-Wolfowitz method of stochastic approximation with regard to a large class of possible mean reward functions.

Keywords :

adaptive control; optimal control; stochastic systems; Kiefer-Wolfowitz method; adaptive control; arm-selection procedure; asymptotic upper bounds; continuum-armed bandit problems; convergence bounds; finite-time lower bounds; optimal control; regret bounds; reward functions; sequential decision procedures; stochastic approximation; Context modeling; Convergence; Extraterrestrial measurements; Helium; Learning; Optimal control; Search problems; Stochastic processes; Temperature control; Upper bound; Adaptive control; sequential decision procedures; stochastic approximation;

fLanguage :

English

Journal_Title :

Automatic Control, IEEE Transactions on

Publisher :

ieee

ISSN :

0018-9286

Type :

jour

DOI :

10.1109/TAC.2009.2019797

Filename :

4982679

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=918500