Title :
Reinforcement learning for multi-agent patrol policy
Author :
Hu, Zhaohui ; Zhao, Dongbin
Author_Institution :
Lab. of Complex Syst. & Intell. Sci., Chinese Acad. of Sci., Beijing, China
Abstract :
This paper presents a reinforcement learning (RL) algorithm for multi-agent patrol tasks, which can be viewed as a dynamic programming problem with stochastic demands. To model the patrol task, we define the cover rate as the reward, the agents' physical positions (on edges and nodes) as the state, and the nodes adjacent to an agent as the actions. This modeling differs substantially from previous work and facilitates communication and cooperation among the agents. Furthermore, we map the state from four dimensions to one dimension to improve training efficiency and reduce coding complexity. A deterministic Softmax algorithm is designed for comparison. We test both algorithms in patrolling and rescuing scenarios. Results show that the patrol cover rate with RL outperforms Softmax by about 15.38%, and that RL reduces the average rescue time for emergent spots by 20% compared to Softmax.
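The state/action/reward modeling described in the abstract can be illustrated with a minimal tabular Q-learning sketch for a single agent on a toy patrol graph. Everything here is an illustrative assumption rather than the paper's actual formulation: the graph, the hyperparameters, and the reward (+1 for covering a not-yet-visited node, a stand-in for the cover rate) are invented, and the paper's four-dimensional state with its one-dimensional mapping is stood in for by a single node index.

```python
import random

# Hypothetical patrol graph: node -> adjacent nodes (not from the paper).
GRAPH = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}

def train(episodes=500, steps=50, alpha=0.1, gamma=0.9, eps=0.1, seed=0):
    """Tabular Q-learning for one patrolling agent.

    State : the agent's current node (a 1-D stand-in for the
            paper's flattened four-dimensional state).
    Action: an adjacent node to move to.
    Reward: +1 when a not-yet-covered node is visited this episode
            (a rough proxy for increasing the cover rate).
    """
    rng = random.Random(seed)
    # One Q-value per (node, adjacent-node) pair.
    Q = {(s, a): 0.0 for s in GRAPH for a in GRAPH[s]}
    for _ in range(episodes):
        node = rng.choice(list(GRAPH))
        covered = {node}
        for _ in range(steps):
            acts = GRAPH[node]
            # Epsilon-greedy action selection.
            if rng.random() < eps:
                a = rng.choice(acts)
            else:
                a = max(acts, key=lambda x: Q[(node, x)])
            r = 0.0 if a in covered else 1.0
            covered.add(a)
            # Standard Q-learning update with a max over next actions.
            best_next = max(Q[(a, x)] for x in GRAPH[a])
            Q[(node, a)] += alpha * (r + gamma * best_next - Q[(node, a)])
            node = a
    return Q

Q = train()
```

The greedy policy derived from `Q` (always moving to the adjacent node with the highest value) tends toward sweeps that revisit nodes less often, which is the intuition behind using the cover rate as the reward signal; the paper's multi-agent version would additionally share state among agents.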
Keywords :
dynamic programming; learning (artificial intelligence); military computing; multi-robot systems; Softmax algorithm; coding complexity; multiagent patrol policy; reinforcement learning; Algorithm design and analysis; Indexes; Learning; Markov processes; Training; Vehicles;
Conference_Titel :
Cognitive Informatics (ICCI), 2010 9th IEEE International Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-4244-8041-8
DOI :
10.1109/COGINF.2010.5599681