Title :
The optimization of path planning for multi-robot system using Boltzmann Policy based Q-learning algorithm
Author :
Zeying Wang ; Zhiguo Shi ; Yuankai Li ; Jun Tu
Author_Institution :
Sch. of Comput. & Commun. Eng., Univ. of Sci. & Technol., Beijing, China
Abstract :
Path planning is a fundamental technique for solving mazes and for moving robots through open fields with obstacles. Q-learning is a model-independent reinforcement learning method that can be used to optimize path planning for robots in a multi-robot collaboration system. However, the traditional Q-learning algorithm is relatively inefficient because it adopts a random exploration policy. In this paper, a Boltzmann policy based Q-learning algorithm is proposed and applied to the problem of path planning optimization for a multi-robot system. The method has two parts: Q-learning and the Boltzmann policy. Q-learning is a grid-based algorithm that can solve low-dimensional path planning problems. The Boltzmann policy combines statistical probability with simulated annealing, so it helps the search avoid becoming trapped in local optima and find a globally optimal solution. Performance is evaluated in Player/Stage, and the results show that the proposed Q-learning algorithm based on the Boltzmann policy remarkably improves the efficiency of the multi-robot system, reducing the number of explorations and helping the learning process converge.
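The combination described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single robot on a small hypothetical grid, and the grid layout, reward values, learning rate, discount factor, and geometric temperature schedule are all illustrative assumptions. It shows tabular Q-learning in which actions are sampled with probability proportional to exp(Q / T), with the temperature T lowered over episodes in the manner of simulated annealing.

```python
import math
import random

# Hypothetical 4x4 grid: S start, G goal, # obstacle (layout is illustrative only).
GRID = ["S...",
        ".#..",
        ".#..",
        "...G"]
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def boltzmann_probs(q_values, temperature):
    """Boltzmann (softmax) policy: P(a) is proportional to exp(Q(a) / T)."""
    m = max(q_values)  # subtract the max for numerical stability
    weights = [math.exp((q - m) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def step(state, action):
    """Move on the grid; hitting a wall or obstacle leaves the robot in place."""
    r, c = state
    dr, dc = ACTIONS[action]
    nr, nc = r + dr, c + dc
    blocked = not (0 <= nr < len(GRID) and 0 <= nc < len(GRID[0]))
    if blocked or GRID[nr][nc] == "#":
        return state, -1.0, False      # assumed collision penalty
    if GRID[nr][nc] == "G":
        return (nr, nc), 10.0, True    # assumed goal reward
    return (nr, nc), -0.1, False       # assumed small step cost


def train(episodes=500, alpha=0.5, gamma=0.9,
          t_start=5.0, t_min=0.05, decay=0.99, seed=0):
    """Tabular Q-learning with Boltzmann exploration (all defaults are assumptions)."""
    rng = random.Random(seed)
    q = {(r, c): [0.0] * len(ACTIONS)
         for r in range(len(GRID)) for c in range(len(GRID[0]))}
    temperature = t_start
    for _ in range(episodes):
        state = (0, 0)
        for _ in range(200):  # cap the episode length
            probs = boltzmann_probs(q[state], temperature)
            action = rng.choices(range(len(ACTIONS)), weights=probs)[0]
            nxt, reward, done = step(state, action)
            target = reward + (0.0 if done else gamma * max(q[nxt]))
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
            if done:
                break
        # Annealing: a high T explores almost uniformly, a low T is nearly greedy.
        temperature = max(t_min, temperature * decay)
    return q


def greedy_path(q, max_steps=50):
    """Follow the highest-valued action from the start cell."""
    state, path = (0, 0), [(0, 0)]
    for _ in range(max_steps):
        action = max(range(len(ACTIONS)), key=lambda a: q[state][a])
        state, _, done = step(state, action)
        path.append(state)
        if done:
            break
    return path
```

Calling `greedy_path(train())` returns the sequence of cells visited under the learned policy. Unlike uniformly random exploration, the Boltzmann policy weights actions by their current value estimates, so exploration concentrates on promising moves as the temperature falls.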
Keywords :
Boltzmann machines; control engineering computing; digital simulation; learning (artificial intelligence); multi-robot systems; path planning; probability; simulated annealing; statistical analysis; Boltzmann policy based Q-learning algorithm; Player-Stage; grid-based algorithm; local optimum; low-dimensional path planning problems; model-independent reinforcement learning method; multirobot collaboration system; path planning optimization; random exploration policy; simulated annealing; statistical probability; trapping avoidance; Convergence; Intelligent agents; Learning (artificial intelligence); Optimization; Path planning; Robot sensing systems;
Conference_Title :
Robotics and Biomimetics (ROBIO), 2013 IEEE International Conference on
Conference_Location :
Shenzhen
DOI :
10.1109/ROBIO.2013.6739627