• DocumentCode
    1562137
  • Title

    Automatic Discovery of Subgoals in Reinforcement Learning Using Unique-Direction Value

  • Author

    Chuan Shi ; Rui Huang ; Zhongzhi Shi

  • Author_Institution
    Chinese Acad. of Sci., Beijing
  • fYear
    2007
  • Firstpage
    480
  • Lastpage
    486
  • Abstract
    Options have proven useful for discovering hierarchical structure in reinforcement learning and thereby speeding up learning. The key problem in automatic option discovery is finding subgoals. Although approaches based on visiting frequency have attracted much research attention, many of them fail to distinguish subgoals from their nearby states. Based on the action-restricted property of subgoals that we identify, subgoals can be regarded as the best-matching action-restricted states along the paths. For the grid-world environment, the concept of unique-direction value, which embodies the action-restricted property, is introduced to find these best-matching action-restricted states. Experimental results show that the proposed approach finds subgoals correctly and that Q-learning with the discovered options speeds up learning greatly.
  • Keywords
    learning (artificial intelligence); Q-learning; action-restricted property; automatic discovery; reinforcement learning; unique-direction value; Autonomous agents; Cognitive informatics; Information analysis; Large-scale systems; Learning; State-space methods; Telecommunication computing; Hierarchical reinforcement learning; option; Q learning; subgoal; unique-direction value;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Cognitive Informatics, 6th IEEE International Conference on
  • Conference_Location
    Lake Tahoe, CA
  • Print_ISBN
    978-1-4244-1327-0
  • Electronic_ISBN
    978-1-4244-1328-7
  • Type

    conf

  • DOI
    10.1109/COGINF.2007.4341927
  • Filename
    4341927