Title :
Automatic Discovery of Subgoals in Reinforcement Learning Using Unique-Direction Value
Author :
Chuan Shi ; Rui Huang ; Zhongzhi Shi
Author_Institution :
Chinese Acad. of Sci., Beijing
Abstract :
Options have proven useful for discovering hierarchical structure in reinforcement learning and thereby speeding up learning. The key problem in automatic option discovery is finding subgoals. Although approaches based on visiting frequency have attracted much research attention, many of them fail to distinguish subgoals from their nearby states. Based on the action-restricted property of subgoals that we identify, subgoals can be regarded as the most-matching action-restricted states on the paths. For the grid-world environment, the concept of the unique-direction value, which embodies the action-restricted property, is introduced to find the most-matching action-restricted states. Experimental results show that the proposed approach finds subgoals correctly and that Q-learning with the discovered options greatly speeds up learning.
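The abstract evaluates the discovered options by comparing against flat Q-learning in a grid world. As context, the following is a minimal sketch of that tabular Q-learning baseline only; the grid size, corner goal, unit reward, and all hyperparameters are illustrative assumptions, and the paper's subgoal discovery and option machinery are not shown.

```python
import random

SIZE = 5                                       # assumed 5x5 grid world
GOAL = (SIZE - 1, SIZE - 1)                    # assumed goal in the far corner
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]   # right, left, down, up
ALPHA, GAMMA, EPSILON = 0.1, 0.95, 0.1         # illustrative hyperparameters

def step(state, action):
    """Apply an action; moves off the grid leave the state unchanged."""
    r = min(max(state[0] + action[0], 0), SIZE - 1)
    c = min(max(state[1] + action[1], 0), SIZE - 1)
    nxt = (r, c)
    return nxt, (1.0 if nxt == GOAL else 0.0), nxt == GOAL

def q_learning(episodes=500, seed=0):
    """Tabular Q-learning with an epsilon-greedy behavior policy."""
    rng = random.Random(seed)
    q = {}  # (state, action_index) -> estimated action value
    for _ in range(episodes):
        state, done = (0, 0), False
        while not done:
            if rng.random() < EPSILON:
                a = rng.randrange(len(ACTIONS))
            else:
                a = max(range(len(ACTIONS)), key=lambda i: q.get((state, i), 0.0))
            nxt, reward, done = step(state, ACTIONS[a])
            best_next = max(q.get((nxt, i), 0.0) for i in range(len(ACTIONS)))
            old = q.get((state, a), 0.0)
            q[(state, a)] = old + ALPHA * (reward + GAMMA * best_next - old)
            state = nxt
    return q
```

In the option framework the same update is applied over temporally extended actions that terminate at the discovered subgoals, which is what shortens learning in large state spaces.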
Keywords :
learning (artificial intelligence); Q-learning; action-restricted property; automatic discovery; reinforcement learning; unique-direction value; Autonomous agents; Cognitive informatics; Information analysis; Large-scale systems; Learning; State-space methods; Telecommunication computing; Hierarchical reinforcement learning option; Q learning; subgoal; unique-direction value;
Conference_Titel :
6th IEEE International Conference on Cognitive Informatics
Conference_Location :
Lake Tahoe, CA
Print_ISBN :
978-1-4244-1327-0
Electronic_ISBN :
978-1-4244-1328-7
DOI :
10.1109/COGINF.2007.4341927