DocumentCode :
586572
Title :
Scaling life-long off-policy learning
Author :
White, A. ; Modayil, J. ; Sutton, Richard S.
Author_Institution :
Dept. of Comput. Sci., Univ. of Alberta, Edmonton, AB, Canada
fYear :
2012
fDate :
7-9 Nov. 2012
Firstpage :
1
Lastpage :
6
Abstract :
In this paper we pursue an approach to scaling life-long learning using parallel off-policy reinforcement learning algorithms. In life-long learning a robot continually learns from a lifetime of experience, slowly acquiring and applying skills and knowledge to new situations. Many of the benefits of life-long learning are a result of scaling the amount of training data processed by the robot to long sensorimotor streams. Another dimension of scaling can be added by allowing off-policy sampling from the unending stream of sensorimotor data generated by a long-lived robot. Recent algorithmic developments have made it possible to apply off-policy algorithms to life-long learning, in a sound way, for the first time. We assess the scalability of these off-policy algorithms on a physical robot. We show that hundreds of accurate multi-step predictions can be learned about several policies in parallel and in real time. We present the first online measures of off-policy learning progress. Finally, we demonstrate that our robot, using the new off-policy measures, can learn 8000 predictions about 300 distinct policies, a substantial increase in scale compared to previous simulated and robotic life-long learning systems.
Keywords :
control engineering computing; intelligent robots; learning (artificial intelligence); parallel processing; sampling methods; algorithm scalability; life-long learning scaling; life-time experience; multistep prediction; off-policy sampling; parallel off-policy reinforcement learning algorithm; physical robot; scaling dimension; sensorimotor data streams; skill acquisition; Approximation algorithms; Computer architecture; Function approximation; Prediction algorithms; Robot sensing systems; Vectors;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Development and Learning and Epigenetic Robotics (ICDL), 2012 IEEE International Conference on
Conference_Location :
San Diego, CA
Print_ISBN :
978-1-4673-4964-2
Electronic_ISBN :
978-1-4673-4963-5
Type :
conf
DOI :
10.1109/DevLrn.2012.6400860
Filename :
6400860