Title :
Discovery of pattern meaning from delayed rewards by reinforcement learning with a recurrent neural network
Author :
Shibata, Katsunari ; Utsunomiya, Hiroki
Author_Institution :
Dept. of Electr. & Electron. Eng., Oita Univ., Oita, Japan
Date :
July 31 2011-Aug. 5 2011
Abstract :
This paper combines reinforcement learning with a recurrent neural network to help explain why humans can discover the meaning of patterns and acquire appropriate behaviors based on that meaning. Using a system with a real movable camera, a simple task demonstrates that the system discovers pattern meaning from delayed rewards by reinforcement learning with a recurrent neural network. The system gets a reward when it moves its camera in the direction of an arrow presented on a display. One of four kinds of arrow is chosen randomly in each episode, and the input to the network is 1,560 visual signals from the camera. After learning, the system could move its camera in the arrow direction. Some hidden neurons were found to represent the arrow direction independently of the presented arrow pattern and to retain it after the arrow disappeared from the image, even though no arrow was visible when the reward was given and the system was never told that the arrow direction is important for obtaining the reward. Generalization to some new arrow patterns and an associative memory function were also observed to some extent.
Keywords :
learning (artificial intelligence); recurrent neural nets; arrow direction; delayed rewards; pattern meaning discovery; recurrent neural network; reinforcement learning; visual signals; Cameras; Green products; Learning; Neurons; Recurrent neural networks; Training; Visualization;
Conference_Title :
Neural Networks (IJCNN), The 2011 International Joint Conference on
Conference_Location :
San Jose, CA
Print_ISBN :
978-1-4244-9635-8
DOI :
10.1109/IJCNN.2011.6033394