DocumentCode
3476897
Title
Using the online cross-entropy method to learn relational policies for playing different games
Author
Sarjant, S.; Pfahringer, B.; Driessens, K.; Smith, T.
Author_Institution
Fac. of Comput. & Math. Sci., Univ. of Waikato, Hamilton, New Zealand
fYear
2011
fDate
Aug. 31 - Sept. 3, 2011
Firstpage
182
Lastpage
189
Abstract
By defining a video-game environment as a collection of objects, relations, actions and rewards, the relational reinforcement learning algorithm presented in this paper generates and optimises a set of concise, human-readable relational rules for achieving maximal reward. Rule learning combines incremental specialisation of rules with a modified online cross-entropy method, which dynamically adjusts the rate of learning as the agent progresses. The algorithm is tested on the Ms. Pac-Man and Mario environments, with results indicating that the agent learns an effective policy for acting within each environment.
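The abstract's core optimisation step can be illustrated with a generic cross-entropy method sketch. This is not the paper's implementation: the reward function, rule representation (a bit-vector of rule inclusions), and fixed step size `alpha` are all illustrative assumptions; the paper's online variant adjusts the learning rate dynamically, which is omitted here.

```python
import random

def cross_entropy_method(reward_fn, n_rules, n_samples=50, elite_frac=0.2,
                         alpha=0.6, iterations=30, seed=0):
    """Generic cross-entropy method sketch (not the authors' algorithm).

    Maintains an inclusion probability per rule, samples candidate
    policies as bit-vectors, and shifts each probability toward its
    frequency among the elite (highest-reward) samples.
    """
    rng = random.Random(seed)
    probs = [0.5] * n_rules  # start with uniform inclusion probabilities
    n_elite = max(1, int(elite_frac * n_samples))
    for _ in range(iterations):
        # Sample candidate policies from the current distribution.
        samples = [[1 if rng.random() < p else 0 for p in probs]
                   for _ in range(n_samples)]
        samples.sort(key=reward_fn, reverse=True)
        elite = samples[:n_elite]
        # Move each probability toward the elite sample frequency.
        for i in range(n_rules):
            freq = sum(s[i] for s in elite) / n_elite
            probs[i] = (1 - alpha) * probs[i] + alpha * freq
    return probs

# Toy reward: prefer policies that select the first three rules only.
target = [1, 1, 1, 0, 0, 0]
reward = lambda s: -sum(abs(a - b) for a, b in zip(s, target))
learned = cross_entropy_method(reward, n_rules=6)
```

With this toy reward, the inclusion probabilities converge toward 1 for the first three rules and toward 0 for the rest, which is the concentration behaviour the cross-entropy method relies on.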
Keywords
computer games; learning (artificial intelligence); video signal processing; Mario environments; Ms. Pac-Man; human-readable relational rules; online cross-entropy method; playing different games; reinforcement learning algorithm; relational policies; video-game environment; Computational intelligence; Conferences; Games; Heuristic algorithms; Junctions; Learning; Learning systems
fLanguage
English
Publisher
IEEE
Conference_Titel
Computational Intelligence and Games (CIG), 2011 IEEE Conference on
Conference_Location
Seoul, South Korea
Print_ISBN
978-1-4577-0010-1
Electronic_ISBN
978-1-4577-0009-5
Type
conf
DOI
10.1109/CIG.2011.6032005
Filename
6032005
Link To Document