• DocumentCode
    1905249
  • Title
    A Model Based Reinforcement Learning Approach Using On-Line Clustering
  • Author
    Tziortziotis, N. ; Blekas, K.
  • Author_Institution
    Dept. of Comput. Sci., Univ. of Ioannina, Ioannina, Greece
  • Volume
    1
  • fYear
    2012
  • fDate
    7-9 Nov. 2012
  • Firstpage
    712
  • Lastpage
    718
  • Abstract
    A significant issue in representing reinforcement learning agents in Markov decision processes is how to design efficient feature spaces in order to estimate the optimal policy. This study addresses the challenge by proposing a compact framework that employs an on-line clustering approach for constructing appropriate basis functions. It also performs a state-action trajectory analysis to gain affinity information among clusters and to estimate their transition dynamics. Value function approximation is used for policy evaluation in a least-squares temporal difference framework. The proposed method is evaluated in several simulated and real environments, where it produced promising results.
  • Keywords
    Markov processes; function approximation; learning (artificial intelligence); least squares approximations; multi-agent systems; pattern clustering; Markov decision process; affinity information; compact framework; feature spaces; least-squares temporal difference framework; model based reinforcement learning approach; online clustering approach; optimal policy estimation; policy evaluation; reinforcement learning agents; state-action trajectory analysis; transition dynamics estimation; value function approximation; Clustering algorithms; Equations; Function approximation; Kernel; Mathematical model; Robot kinematics; clustering; mixture models; model-based reinforcement learning; on-line EM;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Tools with Artificial Intelligence (ICTAI), 2012 IEEE 24th International Conference on
  • Conference_Location
    Athens
  • ISSN
    1082-3409
  • Print_ISBN
    978-1-4799-0227-9
  • Type
    conf
  • DOI
    10.1109/ICTAI.2012.101
  • Filename
    6495113