Suggestion of probabilistic reward-independent knowledge for dynamic environment in reinforcement learning

Author

Shibuya, Nodoka ; Miyazaki, Yoshiki ; Kurashige, Kentarou

Author_Institution

Dept. of Inf. & Electron., Muroran Inst. of Technol., Muroran, Japan

fYear

2011

fDate

6-9 Nov. 2011

Firstpage

140

Lastpage

145

Abstract

Reinforcement learning has recently attracted attention as a learning technique that is often used on actual robots. One of its problems is that it has difficulty coping with a changing purpose, because it depends on reward. To address this problem, we previously proposed learning with information that does not depend on reward, namely environmental transitions. We defined this information as “Reward-Independent Knowledge (RIK)”. A robot acquires RIK and uses it to predict a route from the initial state to the purpose state, which allows reinforcement learning to cope with a changing purpose. However, RIK has difficulty coping with a dynamic environment, because it is a one-to-one correspondence between a state-action pair and a next state. We therefore propose that RIK hold multiple possible next states together with the probability of each. In this paper, we evaluate the proposed knowledge in a simulation experiment on a maze problem in which both the goal and the structure of the maze change, and show that it can cope with a changing purpose and a dynamic environment.
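
The idea of a transition table that stores several possible next states with a probability for each can be illustrated with a minimal sketch. It is not the authors' implementation: the class name `ProbabilisticRIK`, the count-based probability estimate, the breadth-first route search, and the `min_prob` threshold are all assumptions made for illustration, since the abstract does not specify how the probabilities are estimated or how routes are predicted.

```python
from collections import defaultdict, deque


class ProbabilisticRIK:
    """Hypothetical sketch of reward-independent transition knowledge.

    Stores, for each (state, action) pair, how often each next state
    was observed. No reward is recorded anywhere.
    """

    def __init__(self):
        # counts[(state, action)][next_state] = number of observations
        self.counts = defaultdict(lambda: defaultdict(int))

    def observe(self, state, action, next_state):
        """Record one observed environmental transition."""
        self.counts[(state, action)][next_state] += 1

    def probabilities(self, state, action):
        """Estimate P(next_state | state, action) by relative frequency (assumed)."""
        seen = self.counts.get((state, action), {})
        total = sum(seen.values())
        return {s: n / total for s, n in seen.items()} if total else {}

    def plan(self, start, goal, min_prob=0.5):
        """Breadth-first route search using only transitions whose
        estimated probability is at least min_prob (assumed threshold).
        Returns a list of actions, or None if no route is known."""
        parent = {start: None}
        queue = deque([start])
        while queue:
            state = queue.popleft()
            if state == goal:
                break
            for (s, a) in list(self.counts):
                if s != state:
                    continue
                for nxt, p in self.probabilities(s, a).items():
                    if p >= min_prob and nxt not in parent:
                        parent[nxt] = (state, a)
                        queue.append(nxt)
        if goal not in parent:
            return None
        actions = []
        node = goal
        while parent[node] is not None:
            node, action = parent[node]
            actions.append(action)
        return actions[::-1]


rik = ProbabilisticRIK()
for _ in range(3):
    rik.observe("A", "right", "B")   # usually the move succeeds
rik.observe("A", "right", "A")       # sometimes the maze blocks it
rik.observe("B", "down", "C")
route = rik.plan("A", "C")
```

Because the table holds no reward, changing the goal only requires calling `plan` with a different `goal`, and a change in the maze structure shows up as a shift in the observed counts.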

Keywords

learning (artificial intelligence); robots; actual robot; dynamic environment; environmental transition; learning technique; maze problem; probabilistic reward-independent knowledge; reinforcement learning; Equations; Humans; Learning; Mathematical model; Probabilistic logic; Probability; Robots;

fLanguage

English

Publisher

IEEE

Conference_Titel

Micro-NanoMechatronics and Human Science (MHS), 2011 International Symposium on

Conference_Location

Nagoya

ISSN

Pending

Print_ISBN

978-1-4577-1360-6

Type

conf

DOI

10.1109/MHS.2011.6102175

Filename

6102175