• DocumentCode
    2539325
  • Title

    Reinforcement learning for multi-agent patrol policy

  • Author

    Hu, Zhaohui ; Zhao, Dongbin

  • Author_Institution
    Lab. of Complex Syst. & Intell. Sci., Chinese Acad. of Sci., Beijing, China
  • fYear
    2010
  • fDate
    7-9 July 2010
  • Firstpage
    530
  • Lastpage
    535
  • Abstract
    This paper presents a reinforcement learning (RL) algorithm for multi-agent patrol tasks, which can be thought of as a dynamic programming problem with stochastic demands. To model the patrol task, we define the cover rate as the reward, the physical positions of the agents (on both edges and nodes) as the state, and the nodes adjacent to an agent as its actions. This formulation differs entirely from that of other work and facilitates communication and cooperation among the agents. Furthermore, we map the state from four dimensions to one dimension to improve training efficiency and reduce coding complexity. A deterministic Softmax algorithm is designed for comparison. We test both algorithms in patrolling and rescuing scenarios. Results show that the patrol cover rate with RL exceeds that of Softmax by about 15.38%, and that the average rescue time for emergent spots is reduced by 20% with RL compared to Softmax.
  • Keywords
    dynamic programming; learning (artificial intelligence); military computing; multi-robot systems; Softmax algorithm; coding complexity; dynamic programming; multiagent patrol policy; reinforcement learning; Algorithm design and analysis; Indexes; Learning; Markov processes; Training; Vehicles;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Cognitive Informatics (ICCI), 2010 9th IEEE International Conference on
  • Conference_Location
    Beijing
  • Print_ISBN
    978-1-4244-8041-8
  • Type

    conf

  • DOI
    10.1109/COGINF.2010.5599681
  • Filename
    5599681
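
The abstract mentions mapping the state from four dimensions to one dimension. The record does not say how the authors do this, so the following is only a minimal sketch of one common approach, a mixed-radix (row-major) encoding, which turns a tuple of bounded integers into a single table index. The function names `encode_state` and `decode_state`, the component sizes, and the meaning of each component are hypothetical and are not taken from the paper.

```python
from typing import Sequence, Tuple


def encode_state(state: Sequence[int], sizes: Sequence[int]) -> int:
    """Flatten a multi-dimensional state into one index (mixed-radix).

    Assumption: each component state[i] is an integer in [0, sizes[i]).
    """
    if len(state) != len(sizes):
        raise ValueError("state and sizes must have the same length")
    index = 0
    for value, size in zip(state, sizes):
        if not 0 <= value < size:
            raise ValueError(f"component {value} out of range [0, {size})")
        # Shift the accumulated index by this dimension's size, then add.
        index = index * size + value
    return index


def decode_state(index: int, sizes: Sequence[int]) -> Tuple[int, ...]:
    """Invert encode_state: recover the component tuple from the index."""
    values = []
    for size in reversed(sizes):
        index, value = divmod(index, size)
        values.append(value)
    return tuple(reversed(values))


# Hypothetical four-component state with illustrative dimension sizes.
SIZES = (5, 6, 3, 2)
flat = encode_state((2, 4, 1, 0), SIZES)
restored = decode_state(flat, SIZES)
```

With a single integer index, a tabular RL method can store its value estimates in a flat array of length 5 * 6 * 3 * 2 = 180 rather than a four-dimensional table, which is one plausible reading of the reduced coding complexity the abstract refers to.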