Scaling Up Multi-agent Reinforcement Learning in Complex Domains

Author

Xiao, Dan ; Tan, Ah-Hwee

Author_Institution

Sch. of Comput. Eng. & Intell. Syst. Centre, Nanyang Technol. Univ., Singapore

Volume

2

fYear

2008

fDate

9-12 Dec. 2008

Firstpage

326

Lastpage

329

Abstract

TD-FALCON (temporal difference-fusion architecture for learning, cognition, and navigation) is a class of self-organizing neural networks that incorporates temporal difference (TD) methods for real-time reinforcement learning. In this paper, we present two strategies, namely policy sharing and a neighboring-agent mechanism, to further improve the learning efficiency of TD-FALCON in complex multi-agent domains. Through experiments on a traffic control problem domain and the herding task, we demonstrate that these strategies enable TD-FALCON to remain functional and adaptable in complex multi-agent domains.

Keywords

learning (artificial intelligence); multi-agent systems; neurocontrollers; road traffic; self-organising feature maps; traffic control; TD-FALCON; cognition; learning fusion architecture; multiagent reinforcement learning; navigation; neighboring-agent mechanism; policy sharing; self-organizing neural networks; temporal difference methods; Intelligent agent; Intelligent networks; Learning; Resonance; State estimation; State feedback; Subspace constraints

fLanguage

English

Publisher

IEEE

Conference_Title

Web Intelligence and Intelligent Agent Technology, 2008. WI-IAT '08. IEEE/WIC/ACM International Conference on

Conference_Location

Sydney, NSW

Print_ISBN

978-0-7695-3496-1

Type

conf

DOI

10.1109/WIIAT.2008.259

Filename

4740643