  • DocumentCode
    3476897
  • Title
    Using the online cross-entropy method to learn relational policies for playing different games
  • Author
    Sarjant, S.; Pfahringer, B.; Driessens, K.; Smith, T.
  • Author_Institution
    Fac. of Comput. & Math. Sci., Univ. of Waikato, Hamilton, New Zealand
  • fYear
    2011
  • fDate
    Aug. 31 - Sept. 3, 2011
  • Firstpage
    182
  • Lastpage
    189
  • Abstract
    By defining a video-game environment as a collection of objects, relations, actions and rewards, the relational reinforcement learning algorithm presented in this paper generates and optimises a set of concise, human-readable relational rules for achieving maximal reward. Rule learning combines incremental specialisation of rules with a modified online cross-entropy method, which dynamically adjusts the rate of learning as the agent progresses. The algorithm is tested in the Ms. Pac-Man and Mario environments; the results indicate that the agent learns an effective policy for acting in each environment.
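    The cross-entropy method named in the abstract can be illustrated with a minimal sketch: maintain an inclusion probability for each candidate rule, sample rule sets, and shift the probabilities toward the highest-reward ("elite") samples. The function names, parameters, and the `evaluate` interface below are illustrative assumptions, not the paper's actual implementation (which is incremental/online and relational).

```python
import random

def cross_entropy_rule_selection(rules, evaluate, iterations=50,
                                 samples=30, elite_frac=0.2, alpha=0.6):
    """Toy cross-entropy method over rule subsets (illustrative only).

    `rules` is a list of candidate rule identifiers; `evaluate` maps a
    subset of rules (a candidate policy) to a numeric reward. Both are
    placeholders for this sketch, not the paper's interface.
    """
    probs = [0.5] * len(rules)            # initial inclusion probabilities
    n_elite = max(1, int(samples * elite_frac))
    for _ in range(iterations):
        population = []
        for _ in range(samples):
            # Sample a policy: include each rule with its current probability.
            mask = [random.random() < p for p in probs]
            policy = [r for r, m in zip(rules, mask) if m]
            population.append((evaluate(policy), mask))
        # Keep the highest-reward (elite) samples.
        population.sort(key=lambda pair: pair[0], reverse=True)
        elites = [mask for _, mask in population[:n_elite]]
        # Move each probability toward its frequency among the elites,
        # with step size alpha (the paper adjusts this rate online).
        for i in range(len(probs)):
            freq = sum(m[i] for m in elites) / n_elite
            probs[i] = (1 - alpha) * probs[i] + alpha * freq
    return probs
```

    On a toy reward that favours some rules and penalises others, the inclusion probabilities converge toward 1 for useful rules and 0 for harmful ones.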
  • Keywords
    computer games; learning (artificial intelligence); video signal processing; Mario environments; Ms. Pac-Man; human readable relational rules; online cross entropy method; playing different games; reinforcement learning algorithm; relational policies; video game environment; Computational intelligence; Conferences; Games; Heuristic algorithms; Junctions; Learning; Learning systems
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    2011 IEEE Conference on Computational Intelligence and Games (CIG)
  • Conference_Location
    Seoul
  • Print_ISBN
    978-1-4577-0010-1
  • Electronic_ISBN
    978-1-4577-0009-5
  • Type
    conf
  • DOI
    10.1109/CIG.2011.6032005
  • Filename
    6032005