Emergence of flexible prediction-based discrete decision making and continuous motion generation through actor-Q-learning

Author

Shibata, Kenji ; Goto, Keisuke

Author_Institution

Dept. of Electr. & Electron. Eng., Oita Univ., Oita, Japan

fYear

2013

fDate

18-22 Aug. 2013

Firstpage

1

Lastpage

6

Abstract

In this paper, the authors first point out the importance of three factors for closing the gap between humans and robots in real-world flexibility: (1) parallel processing, (2) emergence through learning and solving “what” problems, and (3) abstraction and generalization in the abstract space. To explore the possibility of human-like flexibility in robots, a prediction-required task, in which an agent (robot) gets a reward by capturing a moving target that sometimes becomes invisible, was learned by reinforcement learning using a recurrent neural network. Even though the agent did not know in advance that “prediction is required” or “what information should be predicted”, it acquired both appropriate discrete decision making, in which 'capture' or 'move' was chosen, and continuous motion generation in two-dimensional space. Furthermore, in this task, the target sometimes changed its moving direction randomly when it became visible again after being invisible. The agent could then change its moving direction promptly and appropriately without any special architecture or technique being introduced. Such an emergent property is something that general parallel processing systems such as the subsumption architecture do not have, and the authors believe it is a key to solving the “Frame Problem” fundamentally.

Keywords

continuous systems; discrete systems; generalisation (artificial intelligence); learning systems; motion control; neurocontrollers; predictive control; problem solving; recurrent neural nets; robots; 2D space continuous motion generation; abstract space; abstraction; actor-Q-learning; agent moving direction changing; flexible prediction-based discrete decision making; frame problem; generalization; human-like flexibility; parallel processing; recurrent neural network; reinforcement learning; subsumption architecture; what problem solving; Neurons; Recurrent neural networks; Robot sensing systems; Timing; Training; Vectors;

fLanguage

English

Publisher

IEEE

Conference_Titel

Development and Learning and Epigenetic Robotics (ICDL), 2013 IEEE Third Joint International Conference on

Conference_Location

Osaka

Type

conf

DOI

10.1109/DevLrn.2013.6652559

Filename

6652559