DocumentCode :
2727677
Title :
Strategic Choices: Small Budgets and Simple Regret
Author :
Cheng-Wei Chou ; Ping-Chiang Chou ; Chang-Shing Lee ; Saint-Pierre, D.L. ; Teytaud, Olivier ; Mei-Hui Wang ; Li-Wen Wu ; Shi-Jim Yen
Author_Institution :
Dept. of Comput. Sci. & Inf. Eng., NDHU, Hualian, Taiwan
fYear :
2012
fDate :
16-18 Nov. 2012
Firstpage :
182
Lastpage :
187
Abstract :
In many decision problems, there are two levels of choice: the first is strategic and the second is tactical. We formalize the difference between the two, discuss the relevance of the bandit literature for strategic decisions, and test the quality of different bandit algorithms on real-world examples such as board games and card games. For exploration-exploitation algorithms, we evaluate Upper Confidence Bounds and Exponential Weights, as well as algorithms designed for simple regret, such as Successive Reject. For the exploitation, we also evaluate Bernstein Races and Uniform Sampling. For the recommendation part, we test Empirically Best Arm, Most Played, Lower Confidence Bounds and Empirical Distribution. In the one-player case, we recommend Upper Confidence Bound as an exploration algorithm (in particular its variant adaptUCBE for parameter-free simple regret) and Lower Confidence Bound or Most Played Arm as recommendation algorithms. In the two-player case, we point out the convenience and efficiency of the EXP3 algorithm, and the very clear improvement provided by the truncation algorithm TEXP3. Incidentally, our algorithm won some games against professional players in kill-all Go (to the best of our knowledge, for the first time in computer games).
Keywords :
decision making; game theory; optimisation; Bernstein races; EXP3 algorithm; TEXP3; bandit literature; board games; card games; decision problems; empirical distribution; empirically best arm; exploration-exploitation algorithm; exponential weights; lower confidence bounds; most played; strategic decisions; truncation algorithm; uniform sampling; upper confidence bounds; Algorithm design and analysis; Computer science; Computers; Context; Educational institutions; Games; Humans; Bandit problems; exploration policy; recommendation policy;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Technologies and Applications of Artificial Intelligence (TAAI), 2012 Conference on
Conference_Location :
Tainan
Print_ISBN :
978-1-4673-4976-5
Type :
conf
DOI :
10.1109/TAAI.2012.35
Filename :
6395027