DocumentCode :
493373
Title :
The knowledge gradient algorithm for online subset selection
Author :
Ryzhov, Ilya O. ; Powell, Warren
Author_Institution :
Dept. of Oper. Res. & Financial Eng., Princeton Univ., Princeton, NJ
fYear :
2009
fDate :
March 30, 2009 - April 2, 2009
Firstpage :
137
Lastpage :
144
Abstract :
We derive a one-period look-ahead policy for online subset selection problems, where learning about one subset also gives us information about other subsets. The subset selection problem is treated as a multi-armed bandit problem with correlated prior beliefs. We show that our decision rule is easily computable, and present experimental evidence that the policy is competitive against other online learning policies.
Keywords :
learning (artificial intelligence); mathematical analysis; knowledge gradient algorithm; multiarmed bandit problem; one-period look-ahead policy; online learning policies; online subset selection; online subset selection problems; Contracts; Costs; Drugs; Energy management; Heating; Insulation; Medical treatment; Pharmaceutical technology; Portfolios; Testing;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Adaptive Dynamic Programming and Reinforcement Learning, 2009. ADPRL '09. IEEE Symposium on
Conference_Location :
Nashville, TN
Print_ISBN :
978-1-4244-2761-1
Type :
conf
DOI :
10.1109/ADPRL.2009.4927537
Filename :
4927537