A learning algorithm for the finite-time two-armed bandit problem

Author

Sato, Mitsuhisa; Abe, Kiyohiko; Takeda, H.

Author_Institution

Dept. of Electrical Engng., Tohoku Univ., Sendai, Japan

Issue

fYear

1984

Firstpage

528

Lastpage

534

Abstract

A simple algorithm for the finite-time two-armed bandit problem is proposed. The algorithm divides the whole process into an initial estimating process followed by a controlling process. Efficient approximation-based methods for computing the optimal length of the estimating process are provided.
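
The two-phase structure described in the abstract can be illustrated with a minimal sketch. It assumes Bernoulli-reward arms, alternating pulls during the estimating process, and a commitment to the empirically better arm during the controlling process. These choices, the function name `explore_then_commit`, and all parameter names are illustrative assumptions, not details taken from the paper. The length of the estimating process (`explore_len`) is simply passed in; the paper's approximation methods for choosing it are not reproduced here.

```python
import random


def explore_then_commit(p1, p2, horizon, explore_len, rng):
    """Play two Bernoulli arms for `horizon` rounds and return (total_reward, chosen_arm).

    p1, p2      -- success probabilities of the arms (unknown to the learner)
    horizon     -- total number of rounds (the finite time)
    explore_len -- assumed length of the estimating process
    rng         -- a random.Random instance
    """
    probs = (p1, p2)
    pulls = [0, 0]
    wins = [0, 0]
    total = 0

    # Estimating process: alternate between the two arms.
    for t in range(min(explore_len, horizon)):
        arm = t % 2
        reward = 1 if rng.random() < probs[arm] else 0
        pulls[arm] += 1
        wins[arm] += reward
        total += reward

    # Empirical success rates; an arm that was never pulled counts as 0.0.
    means = [wins[a] / pulls[a] if pulls[a] else 0.0 for a in range(2)]
    best = 0 if means[0] >= means[1] else 1

    # Controlling process: play the apparently better arm until the end.
    for _ in range(max(horizon - explore_len, 0)):
        total += 1 if rng.random() < probs[best] else 0

    return total, best


if __name__ == "__main__":
    reward, arm = explore_then_commit(0.7, 0.4, horizon=1000, explore_len=50,
                                      rng=random.Random(0))
    print(f"committed to arm {arm}, total reward {reward}")
```

A short estimating process risks committing to the worse arm, while a long one spends many rounds on it regardless, which is why the choice of `explore_len` matters for a fixed horizon.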

Keywords

game theory; learning systems; approximation; controlling process; estimating process; finite-time; learning algorithm; optimal length; two-armed bandit problem; Algorithm design and analysis; Approximation methods; Estimation; Nickel; Niobium; Process control; Skeleton

fLanguage

English

Journal_Title

Systems, Man and Cybernetics, IEEE Transactions on

Publisher

IEEE

ISSN

0018-9472

Type

jour

DOI

10.1109/TSMC.1984.6313253

Filename

6313253

Link To Document

https://search.isc.ac/dl/search/defaultta.aspx?DTC=49&DC=1299955