DocumentCode :
116007
Title :
Satisficing in Gaussian bandit problems
Author :
Reverdy, Paul ; Leonard, Naomi E.
Author_Institution :
Dept. of Mech. & Aerosp. Eng., Princeton Univ., Princeton, NJ, USA
fYear :
2014
fDate :
15-17 Dec. 2014
Firstpage :
5718
Lastpage :
5723
Abstract :
We propose a satisficing objective for the multi-armed bandit problem, in which the goal is to achieve performance above a given threshold. We show that this new problem is equivalent to a standard multi-armed bandit problem with a maximizing objective, and we use this equivalence to find bounds on performance in terms of the satisficing objective. For the special case of Gaussian rewards, we show that the satisficing problem is equivalent to a related standard multi-armed bandit problem, again with Gaussian rewards. We apply the Upper Credible Limit (UCL) algorithm to this standard problem and show how it achieves optimal performance in terms of the satisficing objective.
Keywords :
Gaussian processes; Gaussian bandit problems; Gaussian rewards; UCL algorithm; optimal performance; satisficing objective; standard multi-armed bandit problem; upper credible limit algorithm; Bayes methods; Context; Decision making; Inference algorithms; Probability distribution; Random variables; Standards;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Decision and Control (CDC), 2014 IEEE 53rd Annual Conference on
Conference_Location :
Los Angeles, CA
Print_ISBN :
978-1-4799-7746-8
Type :
conf
DOI :
10.1109/CDC.2014.7040284
Filename :
7040284