Minimax lower bounds for the two-armed bandit problem

Author

Kulkarni, Sanjeev R.; Lugosi, Gabor

Author_Institution

Dept. of Electr. Eng., Princeton Univ., NJ, USA

Volume

fYear

1997

fDate

10-12 Dec 1997

Firstpage

2293

Abstract

We obtain minimax lower bounds on the regret for the classical two-armed bandit problem. We provide a finite-sample minimax version of the well-known log n asymptotic lower bound of Lai and Robbins (1985). Also, in contrast to the log n asymptotic results on the regret, we show that the minimax regret is achieved by mere random guessing under fairly mild conditions on the set of allowable configurations of the two arms. That is, we show that for every allocation rule and for every n, there is a configuration such that the regret at time n is at least (1-ε) times the regret of random guessing, where ε is any small positive constant.

Keywords

minimax techniques; random processes; asymptotic lower bound; finite-sample minimax version; minimax lower bounds; minimax regret; random guessing; two-armed bandit problem; Arm; Density measurement

fLanguage

English

Publisher

IEEE

Conference_Titel

Proceedings of the 36th IEEE Conference on Decision and Control, 1997

Conference_Location

San Diego, CA

ISSN

0191-2216

Print_ISBN

0-7803-4187-2

Type

conf

DOI

10.1109/CDC.1997.657117

Filename

657117

Link To Document

https://search.isc.ac/dl/search/defaultta.aspx?DTC=49&DC=321199