Title :
Two mode Q-learning
Author :
Park, Kui-Hong ; Kim, Jong-Hwan
Author_Institution :
Dept. of Electr. Eng. & Comput. Sci., Korea Adv. Inst. of Sci. & Technol., Daejeon, South Korea
Abstract :
In this paper, a new two-mode Q-learning scheme that uses both the success and failure experiences of an agent is proposed for fast convergence; it extends Q-learning, a well-known reinforcement learning method. In conventional Q-learning, when the agent enters a "fail" state, it receives a punishment from the environment, and this punishment decreases the Q value of the action that generated the failure experience. In contrast, the proposed two-mode Q-learning selects actions in the state-action space based on both the normal and failure Q values. A failure Q value module determines the failure Q value from the agent's previous failure experiences. To demonstrate the effectiveness of the proposed method, it is compared with conventional Q-learning in a goalie system performing goalkeeping in robot soccer.
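The abstract does not give the update equations, so the following Python sketch is only one plausible reading of the idea: the class name TwoModeQLearner, the combined action score (normal Q value minus failure Q value), and all hyperparameters are assumptions for illustration, not the authors' exact method.

```python
# Illustrative sketch of two-mode Q-learning. The combination rule
# (Q - Q_fail) and all hyperparameters are assumptions; the abstract
# does not specify the authors' exact update equations.
import random
from collections import defaultdict

class TwoModeQLearner:
    def __init__(self, actions, alpha=0.1, gamma=0.9, epsilon=0.1):
        self.actions = actions
        self.alpha = alpha        # learning rate
        self.gamma = gamma        # discount factor
        self.epsilon = epsilon    # exploration rate
        self.q = defaultdict(float)       # normal Q values
        self.q_fail = defaultdict(float)  # failure Q value module

    def select_action(self, state):
        """Epsilon-greedy over a combined score: the normal Q value
        penalized by the learned failure Q value (assumed rule)."""
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        return max(self.actions,
                   key=lambda a: self.q[(state, a)] - self.q_fail[(state, a)])

    def update(self, state, action, reward, next_state, failed):
        """Standard Q-learning update for the normal table; on entering
        a "fail" state, the failure experience is recorded in the
        failure table instead (assumed behavior)."""
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])
        if failed:
            # Push the failure Q value of the failing action toward 1.
            self.q_fail[(state, action)] += self.alpha * (1.0 - self.q_fail[(state, action)])
```

Under this reading, a goalie agent would call select_action at each step and pass failed=True to update when a goal is conceded, so that actions repeatedly associated with failure are avoided without their normal Q values being punished directly.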
Keywords :
learning (artificial intelligence); multi-agent systems; multi-robot systems; agent failure experience; reinforcement learning; robot soccer goalkeeping; state-action space; two mode Q-learning; Convergence; Learning; Orbital robotics; Probability
Conference_Title :
The 2003 Congress on Evolutionary Computation (CEC '03)
Print_ISBN :
0-7803-7804-0
DOI :
10.1109/CEC.2003.1299395