DocumentCode :
465502
Title :
A Symmetric Multiprocessor Architecture for Multi-Agent Temporal Difference Learning
Author :
Fields, Scott ; Elhanany, Itamar
Author_Institution :
Student Member, IEEE, Department of Electrical & Computer Engineering, The University of Tennessee, Knoxville, TN 37922. sfields1@utk.edu
Volume :
1
fYear :
2006
fDate :
6-9 Aug. 2006
Firstpage :
505
Lastpage :
509
Abstract :
Temporal difference learning methods have been successfully applied to a wide range of stochastic learning and control problems. In addition to correctness, one metric of a technique's performance is its learning rate - the number of iterations required to converge to an optimal solution. The learning rate can be increased by using multiple agents that can share experience. In a software environment, the potential speedup from additional agents is limited, since adding agents significantly increases the burden of computation and/or hinders real-time processing. To address this problem, this paper presents a parameterized hardware model of a multi-agent system based on a shared-memory Symmetric Multiprocessor (SMP). To the authors' knowledge, this is the first application of an SMP architecture to a multi-agent reinforcement learning system. The control model employed is a multi-agent variation of the Sarsa(λ) algorithm. Several hardware optimization schemes are investigated with respect to feasibility and expected performance. The system is modeled using a cycle-accurate simulation in SystemC. The results indicate that real-time learning rates can be significantly improved by employing the proposed parallel hardware implementation.
Keywords :
Computational modeling; Computer architecture; Computer interfaces; Concurrent computing; Costs; Hardware; Learning systems; Message passing; Multiagent systems; Stochastic processes;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Circuits and Systems, 2006. MWSCAS '06. 49th IEEE International Midwest Symposium on
Conference_Location :
San Juan, PR
ISSN :
1548-3746
Print_ISBN :
1-4244-0172-0
Electronic_ISBN :
1548-3746
Type :
conf
DOI :
10.1109/MWSCAS.2006.382109
Filename :
4267186