DocumentCode
3726541
Title
Correlated Gaussian Multi-Objective Multi-Armed Bandit Across Arms Algorithm
Author
Saba Q. Yahyaa;Madalina M. Drugan
Author_Institution
Dept. of Comput. Sci., Vrije Univ. Brussel, Brussels, Belgium
fYear
2015
Firstpage
593
Lastpage
600
Abstract
The stochastic multi-objective multi-armed bandit (MOMAB) problem is a stochastic multi-armed bandit problem in which each arm generates a vector of rewards instead of a single scalar reward. The goal in MOMAB is to minimize the regret of playing suboptimal arms while playing the Pareto-optimal arms fairly. In this paper, we consider Gaussian correlation across arms in MOMAB, meaning that the reward vector generated by an arm gives information not only about that arm but also about all the other available arms. We call this framework the correlated-MOMAB problem. We extended the Gittins index policy to correlated MOMAB because the Gittins index has been used before to model correlation between arms. We empirically compared the Gittins index policy with the multi-objective upper confidence bound policy on a test suite of correlated-MOMAB problems. We conclude that the performance of these policies depends on the number of arms and objectives.
Keywords
"Indexes","Pareto optimization","Gaussian distribution","Correlation","Stochastic processes","Probability distribution"
Publisher
ieee
Conference_Titel
Computational Intelligence, 2015 IEEE Symposium Series on
Print_ISBN
978-1-4799-7560-0
Type
conf
DOI
10.1109/SSCI.2015.93
Filename
7376666
Link To Document