DocumentCode
1299955
Title
A learning algorithm for the finite-time two-armed bandit problem
Author
Sato, Mitsuhisa ; Abe, Kiyohiko ; Takeda, H.
Author_Institution
Dept. of Electrical Engng., Tohoku Univ., Sendai, Japan
Issue
3
fYear
1984
Firstpage
528
Lastpage
534
Abstract
A simple algorithm for the finite-time two-armed bandit problem is proposed. The algorithm divides the whole process into an initial estimating process followed by a controlling process. Efficient approximate methods for computing the optimal length of the estimating process are provided.
Keywords
game theory; learning systems; approximation; controlling process; estimating process; finite-time; learning algorithm; optimal length; two-armed bandit problem; Algorithm design and analysis; Approximation methods; Estimation; Nickel; Niobium; Process control; Skeleton
fLanguage
English
Journal_Title
Systems, Man and Cybernetics, IEEE Transactions on
Publisher
IEEE
ISSN
0018-9472
Type
jour
DOI
10.1109/TSMC.1984.6313253
Filename
6313253