Title :
The knowledge gradient algorithm for online subset selection
Author :
Ryzhov, Ilya O. ; Powell, Warren
Author_Institution :
Department of Operations Research and Financial Engineering, Princeton University, Princeton, NJ
Date :
March 30, 2009 to April 2, 2009
Abstract :
We derive a one-period look-ahead policy for online subset selection problems, where learning about one subset also gives us information about other subsets. The subset selection problem is treated as a multi-armed bandit problem with correlated prior beliefs. We show that our decision rule is easily computable, and present experimental evidence that the policy is competitive against other online learning policies.
Keywords :
learning (artificial intelligence); mathematical analysis; knowledge gradient algorithm; multiarmed bandit problem; one-period look-ahead policy; online learning policies; online subset selection; online subset selection problems; Contracts; Costs; Drugs; Energy management; Heating; Insulation; Medical treatment; Pharmaceutical technology; Portfolios; Testing
Conference_Title :
Adaptive Dynamic Programming and Reinforcement Learning, 2009. ADPRL '09. IEEE Symposium on
Conference_Location :
Nashville, TN
Print_ISBN :
978-1-4244-2761-1
DOI :
10.1109/ADPRL.2009.4927537