DocumentCode :
1739788
Title :
Switching Q-learning in partially observable Markovian environments
Author :
Kamaya, Hiroyuki ; Lee, Haeyeon ; Abe, Kenichi
Author_Institution :
Dept. Electr. Eng., Hachinohe Nat. Coll. of Technol., Japan
Volume :
2
fYear :
2000
fDate :
2000
Firstpage :
1062
Abstract :
Recent research on hidden-state reinforcement learning (RL) problems has concentrated on overcoming partial observability by using memory to estimate states. Switching Q-learning (SQ-learning) is a novel memoryless approach for RL in partially observable environments. The basic idea of SQ-learning is that “non-Markovian” tasks can be automatically decomposed into subtasks solvable by memoryless policies, without any additional information indicating “good” subgoals. To perform this decomposition, SQ-learning employs ordered sequences of Q-modules, in which each module discovers a local control policy. Furthermore, a hierarchical-structure learning automaton is used to find appropriate subgoal sequences. We apply SQ-learning to three partially observable maze problems. The results of extensive simulations demonstrate that SQ-learning can quickly learn optimal or near-optimal policies without a huge computational burden.
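The abstract describes the algorithm only at a high level. As a rough illustration, here is a minimal Python sketch of the switching idea: an ordered sequence of memoryless Q-modules, with control handed from one module to the next when a subgoal observation is reached. All names (QModule, SwitchingQAgent), hyperparameters, and the fixed subgoal list are illustrative assumptions; in particular, the paper's hierarchical-structure learning automaton, which learns the subgoal sequence, is replaced here by a hand-specified list.

```python
# Hedged sketch of the Switching Q-learning idea from the abstract.
# Assumption: switching is triggered by reaching a fixed subgoal observation;
# the paper instead learns subgoal sequences with a learning automaton.
import random
from collections import defaultdict

class QModule:
    """One memoryless Q-table: observation -> action values."""
    def __init__(self, n_actions, alpha=0.1, gamma=0.9, epsilon=0.1):
        self.q = defaultdict(lambda: [0.0] * n_actions)
        self.n_actions = n_actions
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon

    def act(self, obs):
        # Epsilon-greedy action selection over the current observation.
        if random.random() < self.epsilon:
            return random.randrange(self.n_actions)
        vals = self.q[obs]
        return vals.index(max(vals))

    def update(self, obs, action, reward, next_obs):
        # Standard one-step Q-learning update.
        target = reward + self.gamma * max(self.q[next_obs])
        self.q[obs][action] += self.alpha * (target - self.q[obs][action])

class SwitchingQAgent:
    """Ordered sequence of Q-modules; advances to the next module
    when the active subgoal observation is reached."""
    def __init__(self, n_actions, subgoals):
        self.modules = [QModule(n_actions) for _ in range(len(subgoals) + 1)]
        self.subgoals = subgoals  # hypothetical fixed subgoal observations
        self.idx = 0              # index of the currently active module

    def reset(self):
        self.idx = 0

    def step(self, obs):
        return self.modules[self.idx].step if False else self.modules[self.idx].act(obs)

    def learn(self, obs, action, reward, next_obs):
        self.modules[self.idx].update(obs, action, reward, next_obs)
        # Hand control to the next module at the assumed subgoal.
        if self.idx < len(self.subgoals) and next_obs == self.subgoals[self.idx]:
            self.idx += 1
```

Because each module only ever sees the observations of its own subtask, aliased observations that would break a single memoryless policy can map to different actions in different modules, which is the decomposition effect the abstract attributes to SQ-learning.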
Keywords :
hierarchical systems; learning (artificial intelligence); learning automata; learning systems; memoryless systems; SQ-learning; hierarchical structure learning automaton; memoryless system; partially observable environments; reinforcement learning; Automatic control; Autonomous agents; Communication switching; Communications technology; Educational institutions; Embedded computing; Learning; Observability; Service robots; State estimation;
fLanguage :
English
Publisher :
ieee
Conference_Title :
Proceedings of the 2000 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2000)
Conference_Location :
Takamatsu, Japan
Print_ISBN :
0-7803-6348-5
Type :
conf
DOI :
10.1109/IROS.2000.893160
Filename :
893160