Regional Information Center for Science and Technology - On the use of backpropagation in associative reinforcement learning

DocumentCode :

3320245

Title :

On the use of backpropagation in associative reinforcement learning

Author :

Williams, Ronald J.

Author_Institution :

Coll. of Comput. Sci., Northeastern Univ., Boston, MA, USA

fYear :

1988

fDate :

24-27 July 1988

Firstpage :

263

Abstract :

A description is given of several ways that backpropagation can be useful in training networks to perform associative reinforcement learning tasks. One way is to train a second network to model the environmental reinforcement signal and to backpropagate through this network into the first network. This technique has been proposed and explored previously in various forms. Another way is based on the use of the REINFORCE algorithm and amounts to backpropagating through deterministic parts of the network while performing a correlation-style computation where the behavior is stochastic. A third way, which is an extension of the second, allows backpropagation through the stochastic parts of the network as well. The mathematical validity of this third technique rests on the use of continuous-valued stochastic units. Some implications of this result for using supervised learning to train networks of stochastic units are noted, and it is also observed that such an approach even permits a seamless blend of associative reinforcement learning and supervised learning within the same network.
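The contrast between the second and third techniques in the abstract can be illustrated with a small numerical sketch. It is not taken from the paper: it assumes a single Gaussian stochastic unit with output y = mu + sigma * eps, and a hypothetical differentiable reward -(y - target)^2 chosen only so the exact gradient, -2 * (mu - target), is known. The names `reward`, `reward_grad`, and `estimate_gradients` are illustrative. The first estimator is the correlation-style REINFORCE term; the second differentiates through the sampled output, which is one reading of what backpropagating through a continuous-valued stochastic unit could look like.

```python
import random


def reward(y, target=1.0):
    # Hypothetical reinforcement signal (an assumption, not from the paper).
    return -(y - target) ** 2


def reward_grad(y, target=1.0):
    # Derivative of the hypothetical reward with respect to the unit output y.
    return -2.0 * (y - target)


def estimate_gradients(mu, sigma, n_samples, seed=0):
    """Monte Carlo estimates of d E[reward] / d mu for a Gaussian unit."""
    rng = random.Random(seed)
    reinforce_sum = 0.0
    pathwise_sum = 0.0
    for _ in range(n_samples):
        eps = rng.gauss(0.0, 1.0)
        y = mu + sigma * eps  # continuous-valued stochastic output
        # Correlation-style term: reward times d log p(y) / d mu.
        reinforce_sum += reward(y) * (y - mu) / sigma ** 2
        # Differentiate through the sample itself: dy/dmu = 1.
        pathwise_sum += reward_grad(y)
    return reinforce_sum / n_samples, pathwise_sum / n_samples


if __name__ == "__main__":
    reinforce_est, pathwise_est = estimate_gradients(0.0, 0.5, 200_000)
    print("REINFORCE estimate:", reinforce_est)
    print("pathwise estimate: ", pathwise_est)
    print("exact gradient:    ", -2.0 * (0.0 - 1.0))
```

Under these assumptions both estimators target the same gradient, but the estimate obtained by differentiating through the sample typically has much lower variance, which is one reason the continuous-valued case is of interest.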

Keywords :

artificial intelligence; learning systems; associative reinforcement learning; backpropagation; continuous-valued stochastic units; machine learning; supervised learning; training networks

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Neural Networks, 1988., IEEE International Conference on

Conference_Location :

San Diego, CA, USA

Type :

conf

DOI :

10.1109/ICNN.1988.23856

Filename :

23856

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3320245