DocumentCode
2028767
Title
Emergence of flexible prediction-based discrete decision making and continuous motion generation through actor-Q-learning
Author
Shibata, Kenji ; Goto, Keisuke
Author_Institution
Dept. of Electr. & Electron. Eng., Oita Univ., Oita, Japan
fYear
2013
fDate
18-22 Aug. 2013
Firstpage
1
Lastpage
6
Abstract
In this paper, the authors first point out the importance of three factors for closing the gap between humans and robots in real-world flexibility. These are (1) parallel processing, (2) emergence through learning and solving “what” problems, and (3) abstraction and generalization on the abstract space. To explore the possibility of human-like flexibility in robots, a prediction-required task, in which an agent (robot) gets a reward by capturing a moving target that sometimes becomes invisible, was learned by reinforcement learning using a recurrent neural network. Even though the agent did not know in advance that “prediction is required” or “what information should be predicted”, it could acquire both appropriate discrete decision making, in which 'capture' or 'move' was chosen, and continuous motion generation in two-dimensional space. Furthermore, in this task, the target sometimes changed its moving direction randomly when it became visible again after being invisible. The agent could then change its moving direction promptly and appropriately without the introduction of any special architecture or technique. Such an emergent property is something that general parallel processing systems such as the Subsumption architecture do not have, and the authors believe it is a key to solving the “Frame Problem” fundamentally.
Keywords
continuous systems; discrete systems; generalisation (artificial intelligence); learning systems; motion control; neurocontrollers; predictive control; problem solving; recurrent neural nets; robots; 2D space continuous motion generation; abstract space; abstraction; actor-Q-learning; agent moving direction changing; flexible prediction-based discrete decision making; frame problem; generalization; human-like flexibility; parallel processing; recurrent neural network; reinforcement learning; subsumption architecture; what problem solving; Neurons; Recurrent neural networks; Robot sensing systems; Timing; Training; Vectors;
fLanguage
English
Publisher
IEEE
Conference_Titel
Development and Learning and Epigenetic Robotics (ICDL), 2013 IEEE Third Joint International Conference on
Conference_Location
Osaka
Type
conf
DOI
10.1109/DevLrn.2013.6652559
Filename
6652559