Title :
Mean-variance and value at risk in multi-armed bandit problems
Author :
Sattar Vakili;Qing Zhao
Author_Institution :
School of Electrical and Computer Engineering, Cornell University, Ithaca, NY 14850, United States
Abstract :
We study risk-averse multi-armed bandit problems under different risk measures, considering three risk mitigation models. In the first model, the variation in the reward values obtained at different times is treated as risk, and the objective is to minimize the mean-variance of the observed rewards. In the second and third models, the quantity of interest is the total reward at the end of the time horizon, and the objective is, respectively, to minimize the mean-variance and to maximize the value at risk of the total reward. We develop risk-averse online learning policies and analyze their regret performance. We also provide tight lower bounds on regret under the model of mean-variance of observations.
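As a rough illustration of the first model, the sketch below computes an empirical mean-variance for a sequence of observed rewards. The abstract does not state the formula, so this assumes the common convention of variance minus a risk-tolerance weight times the mean (lower is better); the function name `empirical_mean_variance` and the parameter `rho` are hypothetical and not taken from the paper.

```python
from statistics import fmean, pvariance


def empirical_mean_variance(rewards, rho=1.0):
    """Assumed definition: population variance minus rho times the mean.

    Lower values are better: low variability and high average reward.
    rho is a hypothetical risk-tolerance weight, not a value from the paper.
    """
    if not rewards:
        raise ValueError("rewards must be non-empty")
    return pvariance(rewards) - rho * fmean(rewards)


# Two reward sequences with the same mean but different variability.
steady = [0.5, 0.5, 0.5, 0.5]
volatile = [0.0, 1.0, 0.0, 1.0]

steady_score = empirical_mean_variance(steady)
volatile_score = empirical_mean_variance(volatile)
```

Under this assumed definition, the steady sequence scores lower than the volatile one, so a risk-averse learner would prefer the arm that produced it even though both arms have the same average reward.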
Keywords :
"Biological system modeling","Reactive power","Risk management","Random variables","Investment","Computational modeling","Decision making"
Conference_Title :
2015 53rd Annual Allerton Conference on Communication, Control, and Computing (Allerton)
DOI :
10.1109/ALLERTON.2015.7447162