Regional Information Center for Science and Technology - An Othello evaluation function based on Temporal Difference Learning using probability of winning

DocumentCode :

1840814

Title :

An Othello evaluation function based on Temporal Difference Learning using probability of winning

Author :

Osaki, Yasuhiro ; Shibahara, Kazutomo ; Tajima, Yasuhiro ; Kotani, Yoshiyuki

Author_Institution :

Dept. of Comput. & Inf. Sci., Tokyo Univ. of Agric. & Technol., Koganei

fYear :

2008

fDate :

15-18 Dec. 2008

Firstpage :

205

Lastpage :

211

Abstract :

This paper presents a new reinforcement learning method, temporal difference learning with Monte Carlo simulation (TDMC), which combines temporal difference (TD) learning with the winning probability of each non-terminal position. Self-teaching evaluation functions for logic games have been studied for many years; however, few successful results using TD have been reported. This is perhaps because the only reward observable in logic games is the final outcome, with no obvious rewards in non-terminal positions. TDMC(lambda) attempts to compensate for this problem by introducing winning probabilities, obtained through Monte Carlo simulation, as substitute rewards. With Othello as the test environment, TDMC(lambda) yielded better learning results than TD(lambda).
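
The abstract does not state the update rule, so the following is only a generic sketch of the idea it describes: estimate a winning probability for each non-terminal position by random playouts, use those estimates as substitute rewards, and form lambda-returns as learning targets. The function names, the `playout` callback, and the default values of `gamma` and `lam` are assumptions made for illustration and are not taken from the paper.

```python
import random


def estimate_win_probability(position, playout, n_playouts=100, rng=None):
    """Estimate the winning probability of `position` by Monte Carlo simulation.

    `playout(position, rng)` is a hypothetical callback that plays one random
    game from `position` and returns True if the side being evaluated wins.
    """
    rng = rng or random.Random(0)
    wins = sum(1 for _ in range(n_playouts) if playout(position, rng))
    return wins / n_playouts


def lambda_returns(rewards, values, gamma=1.0, lam=0.7):
    """Compute lambda-returns by backward recursion.

    rewards[t] is the (substitute) reward for the move out of position t.
    values[t] is the current evaluation of position t; `values` has one more
    entry than `rewards`, the last being the terminal position (usually 0).

    Recursion: G_t = r_t + gamma * ((1 - lam) * V(s_{t+1}) + lam * G_{t+1}),
    with the return past the final move taken to be values[-1].
    """
    if len(values) != len(rewards) + 1:
        raise ValueError("values must have exactly one more entry than rewards")
    n_steps = len(rewards)
    returns = [0.0] * n_steps
    next_return = values[n_steps]
    for t in reversed(range(n_steps)):
        next_return = rewards[t] + gamma * (
            (1.0 - lam) * values[t + 1] + lam * next_return
        )
        returns[t] = next_return
    return returns


def linear_update(weights, features, target, alpha=0.01):
    """Move a linear evaluation function one gradient step toward `target`."""
    prediction = sum(w * x for w, x in zip(weights, features))
    error = target - prediction
    return [w + alpha * error * x for w, x in zip(weights, features)]
```

In such a setup, `rewards` would hold the simulated winning probabilities along one self-play game, and each `returns[t]` would serve as the training target for the evaluation of position `t` through `linear_update`. Setting `lam=0` reduces the target to a one-step TD target, while `lam=1` sums all later rewards.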

Keywords :

Monte Carlo methods; computer games; learning (artificial intelligence); Monte Carlo simulation; Othello evaluation function; logic games; reinforcement learning method; self-teaching evaluation functions; temporal difference learning; winning probabilities; Agriculture; Computational modeling; Educational institutions; Learning systems; Logic; Optimization methods; Testing

fLanguage :

English

Publisher :

IEEE

Conference_Titel :

Computational Intelligence and Games, 2008. CIG '08. IEEE Symposium On

Conference_Location :

Perth, WA

Print_ISBN :

978-1-4244-2973-8

Electronic_ISBN :

978-1-4244-2974-5

Type :

conf

DOI :

10.1109/CIG.2008.5035641

Filename :

5035641

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1840814