Regional Information Center for Science and Technology - Strategic Choices: Small Budgets and Simple Regret

DocumentCode :

2727677

Title :

Strategic Choices: Small Budgets and Simple Regret

Author :

Cheng-Wei Chou ; Ping-Chiang Chou ; Chang-Shing Lee ; Saint-Pierre, D.L. ; Teytaud, Olivier ; Mei-Hui Wang ; Li-Wen Wu ; Shi-Jim Yen

Author_Institution :

Dept. of Comput. Sci. & Inf. Eng., NDHU, Hualian, Taiwan

fYear :

2012

fDate :

16-18 Nov. 2012

Firstpage :

182

Lastpage :

187

Abstract :

In many decision problems, there are two levels of choice: the first is strategic and the second is tactical. We formalize the difference between the two, discuss the relevance of the bandit literature for strategic decisions, and test the quality of different bandit algorithms on real-world examples such as board games and card games. Among exploration-exploitation algorithms, we evaluate Upper Confidence Bounds and Exponential Weights, as well as algorithms designed for simple regret, such as Successive Reject. For the exploitation, we also evaluate Bernstein Races and Uniform Sampling. For the recommendation part, we test Empirically Best Arm, Most Played, Lower Confidence Bounds and Empirical Distribution. In the one-player case, we recommend Upper Confidence Bound as an exploration algorithm (and in particular its variant adaptUCBE for parameter-free simple regret) and Lower Confidence Bound or Most Played Arm as recommendation algorithms. In the two-player case, we point out the convenience and efficiency of the EXP3 algorithm, and the very clear improvement provided by the truncation algorithm TEXP3. Incidentally, our algorithm won some games against professional players in kill-all Go (to the best of our knowledge, for the first time in computer games).

Keywords :

decision making; game theory; optimisation; Bernstein races; EXP3 algorithm; TEXP3; bandit literature; board games; card games; decision problems; empirical distribution; empirically best arm; exploration-exploitation algorithm; exponential weights; lower confidence bounds; most played; strategic decisions; truncation algorithm; uniform sampling; upper confidence bounds; Algorithm design and analysis; Computer science; Computers; Context; Educational institutions; Games; Humans; Bandit problems; exploration policy; recommendation policy;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Technologies and Applications of Artificial Intelligence (TAAI), 2012 Conference on

Conference_Location :

Tainan

Print_ISBN :

978-1-4673-4976-5

Type :

conf

DOI :

10.1109/TAAI.2012.35

Filename :

6395027

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2727677