DocumentCode
2539325
Title
Reinforcement learning for multi-agent patrol policy
Author
Hu, Zhaohui ; Zhao, Dongbin
Author_Institution
Lab. of Complex Syst. & Intell. Sci., Chinese Acad. of Sci., Beijing, China
fYear
2010
fDate
7-9 July 2010
Firstpage
530
Lastpage
535
Abstract
This paper presents a reinforcement learning (RL) algorithm for multi-agent patrol tasks, which can be viewed as a dynamic programming problem with stochastic demands. To model the patrol task, we define the cover rate as the reward, the physical positions of the agents (on both edges and nodes) as the state, and the nodes adjacent to an agent as its actions. This formulation differs substantially from others' work and facilitates communication and cooperation among the agents. Furthermore, we map the state from four dimensions to one dimension to improve training efficiency and reduce coding complexity. A deterministic Softmax algorithm is designed for comparison. We test both algorithms in patrolling and rescuing scenarios. Results show that RL improves the patrol cover rate over Softmax by about 15.38%, and that the average rescue time for emergent spots is reduced by 20% with RL compared to Softmax.
Keywords
dynamic programming; learning (artificial intelligence); military computing; multi-robot systems; Softmax algorithm; coding complexity; dynamic programming; multiagent patrol policy; reinforcement learning; Algorithm design and analysis; Indexes; Learning; Markov processes; Training; Vehicles;
fLanguage
English
Publisher
IEEE
Conference_Titel
Cognitive Informatics (ICCI), 2010 9th IEEE International Conference on
Conference_Location
Beijing
Print_ISBN
978-1-4244-8041-8
Type
conf
DOI
10.1109/COGINF.2010.5599681
Filename
5599681