Title :
Stochastic online learning under unknown time-varying models
Author :
Tehrani, P. ; Qing Zhao
Author_Institution :
Dept. of Electr. & Comput. Eng., Univ. of California, Davis, Davis, CA, USA
Abstract :
An online learning problem under stochastic time-varying models is considered. The problem is treated as a generalization of the classic multi-armed bandit problem when the arm distributions are time-varying. The objective is to study the impact of time variation in arm distributions on the performance of the player's strategy. Sufficient conditions on the rate of model variations under which learning can or cannot improve the regret order are established.
Keywords :
game theory; learning (artificial intelligence); stochastic processes; time-varying systems; classic multiarmed bandit problem generalization; player strategy performance; regret order; stochastic online learning problem; stochastic time-varying models; sufficient condition; time-varying arm distributions; unknown time-varying model; Multi-armed bandit; online learning; time-varying models;
Conference_Titel :
Signals, Systems and Computers (ASILOMAR), 2012 Conference Record of the Forty Sixth Asilomar Conference on
Conference_Location :
Pacific Grove, CA
Print_ISBN :
978-1-4673-5050-1
DOI :
10.1109/ACSSC.2012.6489178