Regional Information Center for Science and Technology (RICeST) - The optimization of path planning for multi-robot system using Boltzmann Policy based Q-learning algorithm

DocumentCode :

681559

Title :

The optimization of path planning for multi-robot system using Boltzmann Policy based Q-learning algorithm

Author :

Zeying Wang ; Zhiguo Shi ; Yuankai Li ; Jun Tu

Author_Institution :

Sch. of Comput. & Commun. Eng., Univ. of Sci. & Technol., Beijing, China

fYear :

2013

fDate :

12-14 Dec. 2013

Firstpage :

1199

Lastpage :

1204

Abstract :

Path planning is a fundamental task in solving mazes and in moving robots through open fields with obstacles. Q-learning is a model-independent reinforcement learning method that can be used to optimize path planning for robots in a multi-robot collaboration system. However, the traditional Q-learning algorithm is relatively inefficient because it adopts a random exploration policy. In this paper, a Boltzmann policy based Q-learning algorithm is proposed and applied to path planning optimization for a multi-robot system. The method has two parts: Q-learning and the Boltzmann policy. Q-learning is a grid-based algorithm that can solve low-dimensional path planning problems. The Boltzmann policy draws on statistical probability and simulated annealing, so it helps the search avoid becoming trapped in a local optimum and reach a globally optimal solution. Performance is evaluated in Player/Stage, and the results show that the proposed Boltzmann policy based Q-learning algorithm markedly improves the efficiency of the multi-robot system, reducing the number of explorations and helping the process converge.
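The combination described in the abstract can be illustrated with a minimal sketch: tabular Q-learning on a grid, with actions sampled from a Boltzmann (softmax) distribution whose temperature is lowered over time in the manner of simulated annealing. This is not the authors' implementation. The grid layout, reward values, learning rate, discount factor, and temperature schedule below are all assumptions chosen for illustration, and the sketch covers a single robot in a plain Python loop rather than a multi-robot system in Player/Stage.

```python
import math
import random

# Hypothetical 4x4 grid (an assumption, not from the paper): 0 = free, 1 = obstacle.
GRID = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [1, 0, 0, 0],
]
START, GOAL = (0, 0), (3, 3)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def step(state, action):
    """Apply an action; moving into a wall or obstacle leaves the robot in place."""
    r, c = state[0] + action[0], state[1] + action[1]
    in_bounds = 0 <= r < len(GRID) and 0 <= c < len(GRID[0])
    if not in_bounds or GRID[r][c] == 1:
        return state, -1.0, False   # assumed collision penalty
    if (r, c) == GOAL:
        return (r, c), 10.0, True   # assumed goal reward
    return (r, c), -0.1, False      # assumed per-step cost


def boltzmann_action(q_values, temperature, rng):
    """Sample an action index with probability proportional to exp(Q / T)."""
    q_max = max(q_values)  # subtracting the max keeps exp() from overflowing
    weights = [math.exp((q - q_max) / temperature) for q in q_values]
    return rng.choices(range(len(q_values)), weights=weights)[0]


def train(episodes=500, alpha=0.5, gamma=0.9,
          t_start=5.0, t_min=0.05, decay=0.99, seed=0):
    """Tabular Q-learning with Boltzmann exploration and a cooling temperature."""
    rng = random.Random(seed)
    q = {(r, c): [0.0] * len(ACTIONS)
         for r in range(len(GRID)) for c in range(len(GRID[0]))}
    temperature = t_start
    for _ in range(episodes):
        state = START
        for _ in range(200):  # cap on steps per episode
            a = boltzmann_action(q[state], temperature, rng)
            nxt, reward, done = step(state, ACTIONS[a])
            target = reward + (0.0 if done else gamma * max(q[nxt]))
            q[state][a] += alpha * (target - q[state][a])
            state = nxt
            if done:
                break
        # Annealing: high T explores almost uniformly, low T is nearly greedy.
        temperature = max(t_min, temperature * decay)
    return q


def greedy_path(q, max_steps=50):
    """Follow the highest-valued action from START until GOAL or the step cap."""
    state, path = START, [START]
    for _ in range(max_steps):
        a = max(range(len(ACTIONS)), key=lambda i: q[state][i])
        state, _, done = step(state, ACTIONS[a])
        path.append(state)
        if done:
            break
    return path


if __name__ == "__main__":
    print(greedy_path(train()))
```

Compared with purely random exploration, the Boltzmann rule weights each action by its current value estimate, so exploration shifts toward promising actions as learning proceeds, while a high early temperature still leaves room to try the alternatives.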

Keywords :

Boltzmann machines; control engineering computing; digital simulation; learning (artificial intelligence); multi-robot systems; path planning; probability; simulated annealing; statistical analysis; Boltzmann policy based Q-learning algorithm; Player-Stage; grid-based algorithm; local optimum; low-dimensional path planning problems; model-independent reinforcement learning method; multirobot collaboration system; path planning optimization; random exploration policy; simulated annealing; statistical probability; trapping avoidance; Convergence; Intelligent agents; Learning (artificial intelligence); Optimization; Path planning; Robot sensing systems;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Robotics and Biomimetics (ROBIO), 2013 IEEE International Conference on

Conference_Location :

Shenzhen

Type :

conf

DOI :

10.1109/ROBIO.2013.6739627

Filename :

6739627

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=681559