DocumentCode :
3041870
Title :
Effect of Virtual Work Braking on Distributed Multi-robot Reinforcement Learning
Author :
Kawano, Hiroyuki
Author_Institution :
NTT Commun. Sci. Labs., NTT Corp., Kanagawa, Japan
fYear :
2013
fDate :
13-16 Oct. 2013
Firstpage :
1987
Lastpage :
1994
Abstract :
Multi-agent reinforcement learning (MARL) is one of the most promising methods for solving multi-robot control problems. One approach to MARL is cooperative Q-learning (CoQ), which uses a learning state space containing the states and actions of all agents. Despite its mathematical foundation for learning convergence, CoQ often suffers from a state-space explosion as the number of agents increases. Another approach to MARL is distributed Q-learning (DiQ), in which each agent uses a learning state space that does not contain the states and actions of the other agents. The state space for DiQ can easily be kept compact, so DiQ seems suitable for multi-robot control problems. However, DiQ has no mathematical guarantee of learning convergence, and it is difficult to apply DiQ to multi-robot control problems in which explicit agreements among the working robots must be maintained to accomplish a mission. To solve these problems, we treat the work operated by the robots as a new agent that regulates the robots' motion. We assume that the work can brake its own motion: the work stops when a robot attempts to push it in an inappropriate direction. The braking policy for the work is obtained by dynamic programming on a Markov decision process, using a map of the environment and the work's geometry. By virtue of this, DiQ converges without a joint state space. Simulation results also show that the proposed method achieves high learning speed.
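The braking policy described in the abstract can be illustrated with a minimal sketch: value iteration on a small grid-world MDP yields a value function for the work's position, and the work "brakes" against any push that does not improve that value. The grid size, goal, obstacles, reward of -1 per step, and discount factor below are hypothetical assumptions for illustration, not the paper's actual setup.

```python
# Illustrative sketch (assumed setup, not the paper's implementation):
# compute a work-braking policy by dynamic programming (value iteration)
# on a grid MDP built from a map of the environment.

GRID_W, GRID_H = 5, 4
GOAL = (4, 3)
OBSTACLES = {(2, 1), (2, 2)}          # assumed obstacle cells
ACTIONS = {"up": (0, 1), "down": (0, -1), "left": (-1, 0), "right": (1, 0)}
GAMMA = 0.9                           # assumed discount factor

def step(state, action):
    """Deterministic work motion; pushes into walls or obstacles leave the work in place."""
    x, y = state
    dx, dy = ACTIONS[action]
    nx, ny = x + dx, y + dy
    if 0 <= nx < GRID_W and 0 <= ny < GRID_H and (nx, ny) not in OBSTACLES:
        return (nx, ny)
    return state

def value_iteration(tol=1e-8):
    """Standard value iteration with reward -1 per step until the goal."""
    states = [(x, y) for x in range(GRID_W) for y in range(GRID_H)
              if (x, y) not in OBSTACLES]
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            if s == GOAL:
                continue              # goal is absorbing with value 0
            best = max(-1.0 + GAMMA * V[step(s, a)] for a in ACTIONS)
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            return V

def braking_policy(V):
    """The work brakes (refuses to move) for any push that does not increase
    its value, i.e. a push in an 'inappropriate' direction."""
    allow = {}
    for s in V:
        for a in ACTIONS:
            allow[(s, a)] = V[step(s, a)] > V[s]
    return allow

V = value_iteration()
policy = braking_policy(V)
```

Because the braking policy is fixed before learning begins, each robot can then run plain distributed Q-learning over its own state space: pushes in inappropriate directions simply fail to move the work, which is how the work agent regulates the robots' motion without a joint state space.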
Keywords :
Markov processes; braking; convergence; decision theory; dynamic programming; learning (artificial intelligence); multi-agent systems; multi-robot systems; problem solving; CoQ; DiQ; MARL; Markov decision process; cooperative Q-learning; distributed Q-learning; distributed multi-robot reinforcement learning; dynamic programming; learning convergence; learning state space; multiagent reinforcement learning; multirobot control problem; problem solving; robot motion regulation; state space explosion; virtual work braking; working robots; Aerospace electronics; Convergence; Joints; Learning (artificial intelligence); Robots; Space missions; Trajectory; distributed system; embodiment; multi-agent learning; reinforcement learning; robotics;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
2013 IEEE International Conference on Systems, Man, and Cybernetics (SMC)
Conference_Location :
Manchester
Type :
conf
DOI :
10.1109/SMC.2013.341
Filename :
6722094