Title :
Sequential Q-Learning With Kalman Filtering for Multirobot Cooperative Transportation
Author :
Wang, Ying ; De Silva, Clarence W.
Author_Institution :
Dept. of Mech. Eng., Univ. of British Columbia, Vancouver, BC, Canada
fDate :
4/1/2010
Abstract :
This paper presents a modified, distributed Q-learning algorithm, termed sequential Q-learning with Kalman filtering (SQKF), for decision making in multirobot cooperation. The SQKF algorithm has two main characteristics. 1) The learning process is arranged sequentially (i.e., the robots do not make decisions simultaneously, but in a predefined sequence) so as to promote cooperation among robots and reduce their Q-learning spaces. 2) A robot does not update its Q-values with the observed global reward. Instead, it uses a dedicated Kalman filter to extract its own local reward from the global reward, and updates its Q-table with this local reward. The SQKF algorithm is intended to solve two problems in multirobot Q-learning: credit assignment and behavior conflicts. The detailed procedure of the SQKF algorithm is presented, and its application is illustrated using a prototype multirobot experimental system. The experimental results show that the algorithm performs better than the conventional single-agent Q-learning algorithm or the team Q-learning algorithm in the multirobot domain.
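The abstract names the two ingredients of SQKF but gives neither the filter equations nor the action-selection rule. The Python sketch below is a minimal illustration under stated assumptions, not the authors' implementation. It assumes the observed global reward can be modeled as g_t = r(s_t) + b_t, where r(s) is the robot's own local reward in state s and b_t is a random-walk term absorbing everything else, and it assumes that a robot choosing later in the sequence skips actions already claimed by earlier robots. The names `LocalRewardFilter`, `choose_sequentially`, and `q_update`, and all parameter values, are hypothetical.

```python
import random


class LocalRewardFilter:
    """Kalman filter over x = [r(0), ..., r(n-1), b].

    Assumed model (not taken from the abstract): g_t = r(s_t) + b_t, with b_t
    a random walk that lumps together the other robots' contributions.
    """

    def __init__(self, n_states, prior_var=1.0, drift_var=0.01, obs_var=0.01):
        size = n_states + 1
        self.n = n_states
        self.x = [0.0] * size  # local reward estimates, then b
        self.P = [[prior_var if i == j else 0.0 for j in range(size)]
                  for i in range(size)]
        self.drift_var = drift_var
        self.obs_var = obs_var

    def update(self, state, global_reward):
        """Fold in one observation and return the local reward estimate."""
        n, x, P = self.n, self.x, self.P
        # Predict: only b drifts, so only its variance grows.
        P[n][n] += self.drift_var
        # Observation row H has ones at `state` and at b: H x = x[state] + x[n].
        innovation = global_reward - (x[state] + x[n])
        ph = [P[k][state] + P[k][n] for k in range(n + 1)]  # P H^T
        s = ph[state] + ph[n] + self.obs_var                # H P H^T + R
        gain = [v / s for v in ph]
        hp = [P[state][j] + P[n][j] for j in range(n + 1)]  # H P
        for k in range(n + 1):
            x[k] += gain[k] * innovation
            for j in range(n + 1):
                P[k][j] -= gain[k] * hp[j]
        return x[state]


def choose_sequentially(q_tables, state, epsilon=0.1, rng=random):
    """Robots decide in list order; each skips actions already claimed.

    The skip rule is an illustrative assumption about how a predefined
    decision sequence could avoid behavior conflicts.
    """
    claimed, choices = set(), []
    for q in q_tables:
        actions = range(len(q[state]))
        free = [a for a in actions if a not in claimed] or list(actions)
        if rng.random() < epsilon:
            action = rng.choice(free)                     # explore
        else:
            action = max(free, key=lambda a: q[state][a])  # exploit
        claimed.add(action)
        choices.append(action)
    return choices


def q_update(q_table, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard one-step Q-learning update, fed the filtered local reward."""
    target = reward + gamma * max(q_table[next_state])
    q_table[state][action] += alpha * (target - q_table[state][action])
```

In a learning loop, each robot would pass the observed global reward through its own `LocalRewardFilter.update` and hand the returned estimate to `q_update`. Under the assumed model the estimates are identifiable only up to a shared constant offset, which leaves the ranking of actions unchanged.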
Keywords :
Kalman filters; decision making; feature extraction; learning (artificial intelligence); multi-robot systems; behavior conflicts; credit assignment; multirobot cooperative transportation; sequential Q-learning with Kalman filtering; Q-learning
Journal_Title :
Mechatronics, IEEE/ASME Transactions on
DOI :
10.1109/TMECH.2009.2024681