Regional Information Center for Science and Technology - Exploring the relationship of reward and punishment in reinforcement learning

DocumentCode :

3269669

Title :

Exploring the relationship of reward and punishment in reinforcement learning

Author :

Lowe, Robert ; Ziemke, Tom

Author_Institution :

Interaction Lab., Univ. of Skövde, Skövde, Sweden

fYear :

2013

fDate :

16-19 April 2013

Firstpage :

140

Lastpage :

147

Abstract :

We present a reinforcement learning algorithm based on Dyna-Sarsa that utilizes separate representations of reward and punishment when guiding state-action value learning and action selection. The adoption of policy meta-learning optimized by a genetic algorithm is explored and results in the context of a two-armed bandit goal-navigation task in a simple grid world are presented. The findings argue for an important role for a genetic algorithm approach for constructing the foundations of autonomous reinforcement learning agents.
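The core idea in the abstract, keeping reward and punishment in separate value representations while learning with a Dyna-style Sarsa update, can be illustrated with a minimal sketch. This is not the authors' algorithm: the grid layout, the wall-bump punishment, the net-value action rule (reward value minus punishment value), the hyperparameters, and all function names are assumptions made for illustration, and the genetic-algorithm meta-learning and two-armed bandit parts of the task are left out.

```python
import random
from collections import defaultdict

# Four moves on the grid: right, left, down, up (row delta, column delta).
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]


def step(state, action, size=4):
    """Hypothetical grid world: reaching the bottom-right cell yields reward 1,
    bumping into a wall yields punishment 1 and leaves the agent in place."""
    row, col = state
    d_row, d_col = ACTIONS[action]
    n_row, n_col = row + d_row, col + d_col
    if not (0 <= n_row < size and 0 <= n_col < size):
        return state, 0.0, 1.0, False
    nxt = (n_row, n_col)
    if nxt == (size - 1, size - 1):
        return nxt, 1.0, 0.0, True
    return nxt, 0.0, 0.0, False


def choose(q_rew, q_pun, state, epsilon, rng):
    """Epsilon-greedy choice on the net value (assumed rule: reward minus punishment)."""
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    net = [q_rew[(state, a)] - q_pun[(state, a)] for a in range(len(ACTIONS))]
    best = max(net)
    return rng.choice([a for a, v in enumerate(net) if v == best])


def train(episodes=300, alpha=0.1, gamma=0.9, epsilon=0.1,
          planning_steps=5, size=4, max_steps=200, seed=0):
    rng = random.Random(seed)
    q_rew = defaultdict(float)  # value table driven only by reward
    q_pun = defaultdict(float)  # value table driven only by punishment
    model = {}                  # remembered transitions for Dyna-style planning

    def update(s, a, rew, pun, s2, a2, done):
        # The same Sarsa update is applied to each table with its own signal.
        for table, signal in ((q_rew, rew), (q_pun, pun)):
            target = signal + (0.0 if done else gamma * table[(s2, a2)])
            table[(s, a)] += alpha * (target - table[(s, a)])

    for _ in range(episodes):
        s = (0, 0)
        a = choose(q_rew, q_pun, s, epsilon, rng)
        for _ in range(max_steps):
            s2, rew, pun, done = step(s, a, size)
            a2 = choose(q_rew, q_pun, s2, epsilon, rng)
            update(s, a, rew, pun, s2, a2, done)
            model[(s, a)] = (rew, pun, s2, done)
            # Planning: replay randomly sampled remembered transitions.
            for _ in range(planning_steps):
                (ps, pa), (prew, ppun, ps2, pdone) = rng.choice(list(model.items()))
                pa2 = choose(q_rew, q_pun, ps2, epsilon, rng)
                update(ps, pa, prew, ppun, ps2, pa2, pdone)
            if done:
                break
            s, a = s2, a2
    return q_rew, q_pun


if __name__ == "__main__":
    q_rew, q_pun = train()
    print("largest reward value:", max(q_rew.values()))
    print("largest punishment value:", max(q_pun.values()))
```

Keeping two tables means the agent can weigh expected reward against expected punishment at selection time instead of collapsing both into one scalar; how that weighting is done is exactly the kind of choice the abstract proposes tuning by meta-learning.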

Keywords :

genetic algorithms; learning (artificial intelligence); Dyna-Sarsa algorithm; action meta-learning functions; action selection; autonomous reinforcement learning agents; genetic algorithm approach; grid world; meta-learning policy optimization; punishment; reinforcement learning algorithm; reward; state-action value learning; two-armed bandit goal-navigation task; Context; Cost accounting; Genetic algorithms; Learning (artificial intelligence); Navigation; Optimization; Planning; Genetic Algorithm; Punishment; Reinforcement Contingencies; Reward; SARSA; TD learning; Value;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Adaptive Dynamic Programming And Reinforcement Learning (ADPRL), 2013 IEEE Symposium on

Conference_Location :

Singapore

ISSN :

2325-1824

Type :

conf

DOI :

10.1109/ADPRL.2013.6615000

Filename :

6615000

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3269669