Regional Information Center for Science and Technology (RICEST) - A parallel architecture for temporal difference learning with eligibility traces

DocumentCode :

3268719

Title :

A parallel architecture for temporal difference learning with eligibility traces

Author :

Turnmire, J. ; Elhanany, I.

fYear :

2007

fDate :

5-8 Aug. 2007

Firstpage :

848

Lastpage :

850

Abstract :

Temporal difference learning is a central idea in reinforcement learning, commonly employed in a broad range of applications that involve delayed rewards. An agent learns by interacting with its environment and constructs a value function which helps map states to actions. A particularly useful tool in temporal difference learning is eligibility traces, which assist the agent in assigning values to recently visited states. This paper explores the gain attainable by utilizing custom hardware to take advantage of the inherent parallelism found in the TD(lambda) algorithm. The result is a scalable framework for high-speed machine learning applications. To the best of the authors' knowledge, this is the first work that attempts to map tabular-form temporal difference learning with eligibility traces onto digital hardware.
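
For readers unfamiliar with the algorithm named in the abstract, the following is a minimal software sketch of tabular TD(lambda) with accumulating eligibility traces. It is not the authors' hardware architecture, which this record does not describe. The function name `td_lambda_episode`, the transition tuple layout, and the default parameter values are assumptions chosen for illustration. The inner loop updates every state independently of the others, which is the kind of per-state parallelism the abstract refers to.

```python
def td_lambda_episode(values, transitions, alpha=0.1, gamma=0.9, lam=0.8):
    """Run one episode of tabular TD(lambda) with accumulating traces.

    values:      list of state-value estimates, updated in place.
    transitions: iterable of (state, reward, next_state, done) tuples
                 (a hypothetical layout chosen for this sketch).
    """
    traces = [0.0] * len(values)  # one eligibility trace per state
    for state, reward, next_state, done in transitions:
        # TD error: one-step target minus the current estimate.
        target = reward if done else reward + gamma * values[next_state]
        delta = target - values[state]
        # Mark the visited state as eligible for credit.
        traces[state] += 1.0
        # Each state's update uses only its own value and trace, so
        # these iterations are independent and could run in parallel.
        for s in range(len(values)):
            values[s] += alpha * delta * traces[s]
            traces[s] *= gamma * lam  # decay the trace
    return values


if __name__ == "__main__":
    # Three-state chain: 0 -> 1 -> 2 (terminal), reward 1 on the last step.
    episode = [(0, 0.0, 1, False), (1, 1.0, 2, True)]
    print(td_lambda_episode([0.0, 0.0, 0.0], episode, alpha=0.5, gamma=1.0, lam=1.0))
    print(td_lambda_episode([0.0, 0.0, 0.0], episode, alpha=0.5, gamma=1.0, lam=0.0))
```

With lam=1.0 the final reward is credited to both visited states, giving [0.5, 0.5, 0.0]. With lam=0.0 the trace of state 0 has fully decayed by the time the reward arrives, so only state 1 is updated, giving [0.0, 0.5, 0.0].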

Keywords :

learning (artificial intelligence); parallel architectures; digital hardware; eligibility trace; machine learning; parallel architecture; reinforcement learning; tabular-form temporal difference learning; Broadcasting; Hardware; Learning; Linear feedback shift registers; Logic; Parallel architectures; Parallel processing; Performance evaluation; Random sequences; State estimation;

fLanguage :

English

Publisher :

IEEE

Conference_Titel :

Circuits and Systems, 2007. MWSCAS 2007. 50th Midwest Symposium on

Conference_Location :

Montreal, Que.

ISSN :

1548-3746

Print_ISBN :

978-1-4244-1175-7

Electronic_ISBN :

1548-3746

Type :

conf

DOI :

10.1109/MWSCAS.2007.4488705

Filename :

4488705

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3268719