Title :
Improving the performance of Q-learning using simultaneous Q-value updating
Author :
Pouyan, Maryam ; Mousavi, Amin ; Golzari, Shahram ; Hatam, Ahmad
Author_Institution :
Electr. & Comput. Eng. Dept., Hormozgan Univ., Bandarabbas, Iran
Abstract :
Q-learning is one of the best-known model-free reinforcement learning algorithms. Its goal is to find an estimate of the optimal action-value function, called the Q-value function, which is defined as the expected sum of future rewards obtained by taking a given action in the current state. The main drawback of Q-learning is that the learning process is expensive for the agent, especially in the early steps, because every state-action pair must be visited frequently for the estimates to converge to the optimal policy. In this paper, the concept of the opposite action is used to improve the performance of the Q-learning algorithm, particularly in the early steps of learning. Opposite actions allow two Q-values to be updated simultaneously: for each action the agent takes, it updates the Q-value of that action and of the corresponding opposite action, thereby increasing the speed of learning. The proposed Q-learning method based on opposite actions is simulated on the well-known grid-world test-bed problem, and the results show its ability to improve the learning process.
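The abstract does not give the exact update rule, so the following Python sketch only illustrates the general idea of the simultaneous two-value update in a grid world. It assumes a fixed opposite-action mapping (up/down, left/right) and assumes the agent can obtain, or model, the reward and successor state of the opposite move; all names (update_with_opposite, OPPOSITE, and the hyperparameter values) are hypothetical, not taken from the paper.

```python
import numpy as np

# Hypothetical 5x5 grid world: 25 states, 4 moves per state.
N_STATES = 25
ACTIONS = ["up", "down", "left", "right"]
OPPOSITE = {"up": "down", "down": "up", "left": "right", "right": "left"}
A_IDX = {a: i for i, a in enumerate(ACTIONS)}

alpha, gamma = 0.1, 0.9           # assumed learning rate and discount
Q = np.zeros((N_STATES, len(ACTIONS)))

def update_with_opposite(s, a, r, s_next, r_opp, s_opp_next):
    """Standard Q-learning update for the taken action (s, a),
    plus a simultaneous update of the opposite action's Q-value,
    using the (assumed available) opposite reward and successor."""
    # Usual one-step Q-learning target for the taken action.
    Q[s, A_IDX[a]] += alpha * (
        r + gamma * Q[s_next].max() - Q[s, A_IDX[a]]
    )
    # Second update, applied in the same step, for the opposite action.
    a_opp = OPPOSITE[a]
    Q[s, A_IDX[a_opp]] += alpha * (
        r_opp + gamma * Q[s_opp_next].max() - Q[s, A_IDX[a_opp]]
    )
```

Because each experienced transition feeds two table entries instead of one, the Q-table is filled in roughly twice as fast early on, which matches the speed-up in the beginning steps that the abstract reports.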
Keywords :
learning (artificial intelligence); optimisation; Q-learning; Q-value function; optimal action-value function; reinforcement learning algorithm; Computational intelligence; Computers; Convergence; Educational institutions; Knowledge engineering; Learning (artificial intelligence); Standards; Q-learning; estimate value; opposite action; reinforcement learning;
Conference_Titel :
2014 International Congress on Technology, Communication and Knowledge (ICTCK)
Conference_Location :
Mashhad
DOI :
10.1109/ICTCK.2014.7033528