Title :
An adaptive state aggregation approach to Q-learning with real-valued action function
Author :
Hwang, Kao-Shing ; Chen, Yu-Jen
Author_Institution :
Department of Electrical Engineering, National Chung Cheng University, Chiayi, Taiwan
Abstract :
The fundamental approach of Q-learning is based on finite discrete state spaces and on incrementally estimating Q-values from the reward received from the environment and the agent's previous Q-value estimates. Unfortunately, robots learn and behave in a continuous perceptual space, where the observed perceptions are transformed into, or coarsely treated as, states. There is currently no elegant way to combine discrete actions with continuous observations or states, so accommodating continuous states with a finite discrete set of actions has become an important and intriguing issue in this research area. We propose an algorithm that defines an action policy mapping from a discrete space to a real-valued domain; that is, the method selects a real-valued action from a discrete set, and a slight bias is immediately imposed on the magnitude of this action before it is taken. From the viewpoint of exploration and exploitation, the method searches for a better action around a paradigm action in the solution space, varying it within the biased region. Furthermore, the proposed method uses the well-known epsilon-greedy strategy to explore a better trace, combined with a narrowed Tabu search.
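As a rough illustration of the action-selection scheme described in the abstract, the following minimal Python sketch combines epsilon-greedy selection over a discrete set of paradigm actions with a small random bias on the chosen action's magnitude, and a simple tabu set that narrows exploration. All names and parameters here (EPSILON, BIAS_RADIUS, the tabu set) are illustrative assumptions, not the authors' implementation.

import random

EPSILON = 0.1       # assumed exploration rate for epsilon-greedy
BIAS_RADIUS = 0.05  # assumed half-width of the biased region around a paradigm action

def select_action(q_values, actions, tabu=None):
    """Pick a discrete paradigm action epsilon-greedily, skipping tabu entries,
    then return a real-valued action perturbed within the biased region."""
    tabu = tabu or set()
    # Narrow the search by excluding tabu actions; fall back to all if every action is tabu.
    candidates = [a for a in range(len(actions)) if a not in tabu] or list(range(len(actions)))
    if random.random() < EPSILON:
        idx = random.choice(candidates)                   # explore among non-tabu actions
    else:
        idx = max(candidates, key=lambda a: q_values[a])  # exploit the best Q-value
    # Impose a slight bias on the magnitude of the chosen discrete action,
    # yielding a real-valued action within the biased region.
    return actions[idx] + random.uniform(-BIAS_RADIUS, BIAS_RADIUS), idx

# Usage example: five paradigm actions spanning a real-valued control range.
actions = [-1.0, -0.5, 0.0, 0.5, 1.0]
q_values = [0.2, 0.1, 0.4, 0.3, 0.0]
real_action, chosen = select_action(q_values, actions, tabu={1})
print(f"paradigm index {chosen}, real-valued action {real_action:.3f}")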
Keywords :
decision trees; function approximation; learning (artificial intelligence); search problems; Q-learning; Q-value estimation; adaptive state aggregation approach; discrete actions; epsilon-greedy; finite discrete set; finite discrete state spaces; narrowed Tabu search; real-valued action function;
Conference_Titel :
2010 IEEE International Conference on Systems, Man and Cybernetics (SMC)
Conference_Location :
Istanbul
Print_ISBN :
978-1-4244-6586-6
DOI :
10.1109/ICSMC.2010.5642228