Adaptive action selection using utility-based reinforcement learning

Author

Chen, Kunrong ; Lin, Fen ; Tan, Qing ; Shi, Zhongzhi

Author_Institution

Key Lab. of Intell. Inf. Process., Chinese Acad. of Sci., Beijing, China

fYear

2009

fDate

17-19 Aug. 2009

Firstpage

67

Lastpage

72

Abstract

A basic problem for intelligent systems is choosing adaptive actions to perform in a non-stationary environment. Because of the combinatorial complexity of actions, an agent cannot consider every option available to it at every instant in time. It needs to find good policies that dictate the optimal action to perform in each situation. This paper proposes an algorithm, called UQ-learning, that addresses the action selection problem by combining reinforcement learning with a utility function. Reinforcement learning provides information about the environment, and the utility function is used to balance exploration and exploitation. We evaluate the method on maze navigation tasks in a non-stationary environment. The results of simulated experiments show that the utility-based reinforcement learning approach is more effective and efficient than Q-learning and recency-based exploration.
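
For orientation, the Q-learning baseline named in the abstract can be sketched as below. This is a minimal illustration of standard tabular Q-learning with epsilon-greedy exploration on a small grid maze, not the UQ-learning algorithm, whose utility function is not described in this record. The grid layout, reward values, parameter defaults, and the function name q_learning are all assumptions made for the example.

```python
import random


def q_learning(grid, start, goal, episodes=200, alpha=0.5, gamma=0.9,
               epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration (illustrative only).

    grid is a list of equal-length strings; '#' marks a wall (assumed format).
    Returns a dict mapping (state, action_index) to the learned value.
    """
    rng = random.Random(seed)
    moves = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
    rows, cols = len(grid), len(grid[0])
    q = {}  # missing entries are treated as 0.0

    def step(state, action):
        r = state[0] + moves[action][0]
        c = state[1] + moves[action][1]
        # Leaving the grid or hitting a wall keeps the agent in place.
        if not (0 <= r < rows and 0 <= c < cols) or grid[r][c] == "#":
            r, c = state
        nxt = (r, c)
        # Assumed rewards: +1 on reaching the goal, small penalty otherwise.
        return nxt, (1.0 if nxt == goal else -0.01)

    for _ in range(episodes):
        state = start
        for _ in range(100):  # cap on episode length
            if rng.random() < epsilon:
                action = rng.randrange(4)  # explore
            else:
                # exploit: pick the action with the highest current estimate
                action = max(range(4), key=lambda i: q.get((state, i), 0.0))
            nxt, reward = step(state, action)
            # The goal is terminal, so it contributes no future value.
            best_next = 0.0 if nxt == goal else max(
                q.get((nxt, i), 0.0) for i in range(4))
            old = q.get((state, action), 0.0)
            q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
            state = nxt
            if state == goal:
                break
    return q


if __name__ == "__main__":
    maze = ["...", ".#.", "..."]
    table = q_learning(maze, start=(0, 0), goal=(2, 2))
    best = max(range(4), key=lambda i: table.get(((0, 0), i), 0.0))
    print("greedy action index at start:", best)
```

In this sketch exploration is controlled only by the fixed epsilon parameter; the abstract indicates that the paper instead uses a utility function to balance exploration and exploitation.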

Keywords

Markov processes; combinatorial mathematics; computational complexity; decision theory; learning (artificial intelligence); multi-agent systems; Markov decision process; UQ-learning algorithm; adaptive action selection; balance exploration-exploitation dilemma; combinatorial complexity; intelligent system; maze navigation task; multiagent system; nonstationary environment; recency-based exploration; utility function; utility-based reinforcement learning algorithm; Adaptive systems; Computers; Environmental management; Information processing; Intelligent agent; Intelligent systems; Learning; Management training; Navigation; Simulated annealing;

fLanguage

English

Publisher

IEEE

Conference_Titel

Granular Computing, 2009, GRC '09. IEEE International Conference on

Conference_Location

Nanchang

Print_ISBN

978-1-4244-4830-2

Type

conf

DOI

10.1109/GRC.2009.5255163

Filename

5255163