Title :
On the optimal solution of the one-armed bandit adaptive control problem
Author :
Kumar, P.R. ; Seidman, Thomas I.
Author_Institution :
University of Maryland, Baltimore County, Baltimore, MD, USA
Date :
October 1, 1981
Abstract :
The "one-armed bandit" problem of Bellman is a classic problem in sequential adaptive control. It is important a) for its own direct applications and b) since it is the simplest problem in the important class of Bayesian adaptive control problems. As in other such problems, even though the dynamic programming equation for the optimal expected return may be written down by inspection, it is extremely difficult to obtain explicit solutions. Concentrating on the case where the unknown parameter is beta-distributed, we establish the existence of a simple boundary curve separating the regions of unambiguous decision. Attention principally centers on the characterization and estimation of this curve. This is done with sufficient success to permit an explicit and complete solution of the problem for certain parameter ranges.
Keywords :
Adaptive control; Optimal control; Bayesian methods; Dynamic programming; System identification;
Journal_Title :
IEEE Transactions on Automatic Control
DOI :
10.1109/TAC.1981.1102790