DocumentCode :
3495859
Title :
Discovery of pattern meaning from delayed rewards by reinforcement learning with a recurrent neural network
Author :
Shibata, Katsunari ; Utsunomiya, Hiroki
Author_Institution :
Dept. of Electr. & Electron. Eng., Oita Univ., Oita, Japan
fYear :
2011
fDate :
July 31 2011-Aug. 5 2011
Firstpage :
1445
Lastpage :
1452
Abstract :
In this paper, the authors combine reinforcement learning with a recurrent neural network to try to explain why humans can discover the meaning of patterns and acquire appropriate behaviors based on that meaning. Using a system with a real movable camera, they demonstrate on a simple task that the system discovers pattern meaning from delayed rewards by reinforcement learning with a recurrent neural network. The system gets a reward when it moves its camera in the direction of an arrow presented on a display. One of four kinds of arrow is chosen randomly at each episode, and the input to the network is 1,560 visual signals from the camera. After learning, the system could move its camera in the arrow direction. Some hidden neurons were found to represent the arrow direction independently of the presented arrow pattern and to retain it after the arrow disappeared from the image, even though no arrow was visible when the reward was given and the system was never told that the arrow direction matters for getting the reward. Generalization to some new arrow patterns and an associative memory function were also observed to some extent.
Keywords :
learning (artificial intelligence); recurrent neural nets; arrow direction; delayed rewards; pattern meaning discovery; recurrent neural network; reinforcement learning; visual signals; Cameras; Green products; Learning; Neurons; Recurrent neural networks; Training; Visualization;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Neural Networks (IJCNN), The 2011 International Joint Conference on
Conference_Location :
San Jose, CA
ISSN :
2161-4393
Print_ISBN :
978-1-4244-9635-8
Type :
conf
DOI :
10.1109/IJCNN.2011.6033394
Filename :
6033394