DocumentCode :
3185388
Title :
An adaptive state aggregation approach to Q-learning with real-valued action function
Author :
Hwang, Kao-Shing ; Chen, Yu-Jen
Author_Institution :
Electr. Eng., Nat. Chung Cheng Univ., Chiayi, Taiwan
fYear :
2010
fDate :
10-13 Oct. 2010
Firstpage :
164
Lastpage :
170
Abstract :
Q-learning is fundamentally based on finite, discrete state spaces and incrementally estimates Q-values from the reward received from the environment and the agent's previous Q-value estimates. Robots, however, learn and behave in a continuous perceptual space, where observed perceptions are transformed into, or coarsely regarded as, states. There is currently no elegant way to combine discrete actions with continuous observations or states, so accommodating continuous states with a finite discrete set of actions has become an important and intriguing issue in this research area. We propose an algorithm that extends an action policy from a discrete space to a real-valued domain: the method selects an action from a discrete set and applies a slight bias to its magnitude immediately before the action is taken, yielding a real-valued action. From the viewpoint of exploration and exploitation, the method searches for a better action around a paradigm action in the solution space, with variation confined to the biased region. Further, the proposed method uses the well-known epsilon-greedy strategy to explore a better trace, but with a narrowized Tabu search.
Keywords :
decision trees; function approximation; learning (artificial intelligence); search problems; Q-learning; Q-value estimation; adaptive state aggregation approach; discrete actions; epsilon greedy; finite discrete set; finite discrete state spaces; narrowized Tabu search; real valued action function;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Systems Man and Cybernetics (SMC), 2010 IEEE International Conference on
Conference_Location :
Istanbul
ISSN :
1062-922X
Print_ISBN :
978-1-4244-6586-6
Type :
conf
DOI :
10.1109/ICSMC.2010.5642228
Filename :
5642228