DocumentCode
1562137
Title
Automatic Discovery of Subgoals in Reinforcement Learning Using Unique-Direction Value
Author
Chuan Shi ; Rui Huang ; Zhongzhi Shi
Author_Institution
Chinese Acad. of Sci., Beijing
fYear
2007
Firstpage
480
Lastpage
486
Abstract
Options have proven useful for discovering hierarchical structure in reinforcement learning and thereby speeding up learning. The key problem in automatic option discovery is finding subgoals. Although approaches based on visiting frequency have attracted much research attention, many of them fail to distinguish subgoals from their nearby states. Based on the action-restricted property of subgoals that we identify, subgoals can be regarded as the best-matching action-restricted states along the paths. For the grid-world environment, the concept of unique-direction value, which embodies the action-restricted property, is introduced to find the best-matching action-restricted states. Experimental results show that the proposed approach finds subgoals correctly and that Q-learning with the discovered options speeds up learning greatly.
Keywords
learning (artificial intelligence); Q-learning; action-restricted property; automatic discovery; reinforcement learning; unique-direction value; Autonomous agents; Cognitive informatics; Information analysis; Large-scale systems; Learning; State-space methods; Telecommunication computing; Hierarchical reinforcement learning option; Q learning; subgoal; unique-direction value;
fLanguage
English
Publisher
ieee
Conference_Titel
Cognitive Informatics, 6th IEEE International Conference on
Conference_Location
Lake Tahoe, CA
Print_ISBN
978-1-4244-1327-0
Electronic_ISBN
978-1-4244-1328-7
Type
conf
DOI
10.1109/COGINF.2007.4341927
Filename
4341927