Regional Information Center for Science and Technology - Continuous Action-Space Reinforcement Learning Methods Applied to the Minimum-Time Swing-Up of the Acrobot

DocumentCode :

3728260

Title :

Continuous Action-Space Reinforcement Learning Methods Applied to the Minimum-Time Swing-Up of the Acrobot

Author :

Barry D. Nichols

Author_Institution :

Sch. of Sci. &

Year :

2015

Firstpage :

2084

Lastpage :

2089

Abstract :

Here I apply three reinforcement learning methods to the full, continuous-action, swing-up acrobot control benchmark problem. These include two approaches from the literature, CACLA and NM-SARSA, and a novel approach which I refer to as Nelder Mead-SARSA. Nelder Mead-SARSA, like NM-SARSA, directly optimises the state-action value function for action selection, allowing continuous-action reinforcement learning without a separate policy function. However, because it uses a derivative-free method, it does not require the first or second partial derivatives of the value function. All three methods achieved good swing-up times, comparable to previous approaches from the literature. Nelder Mead-SARSA in particular performed the swing-up in a shorter time than many approaches from the literature.
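
The action-selection idea described in the abstract (maximising the state-action value function over the action with a derivative-free search) might be sketched as follows. This is only an illustration under stated assumptions, not the paper's implementation: the record gives no details of the value-function approximator, the Nelder-Mead settings, or the exploration scheme. The names `nelder_mead`, `select_action`, and `toy_q`, the action bounds, and the quadratic stand-in for Q are all hypothetical, and the simplex search is a simplified variant (reflection, expansion, inside contraction, shrink). The SARSA value update itself is omitted.

```python
from typing import Callable, List, Sequence


def nelder_mead(f: Callable[[List[float]], float], x0: Sequence[float],
                step: float = 0.5, iters: int = 200,
                tol: float = 1e-6) -> List[float]:
    """Minimise f with a simplified Nelder-Mead search (no derivatives)."""
    n = len(x0)
    # Initial simplex: x0 plus one point offset along each axis.
    simplex = [list(x0)]
    for i in range(n):
        p = list(x0)
        p[i] += step
        simplex.append(p)

    for _ in range(iters):
        simplex.sort(key=f)
        best, worst = simplex[0], simplex[-1]
        # Stop once the simplex has collapsed to a (near) single point.
        size = max(abs(p[i] - best[i]) for p in simplex for i in range(n))
        if size < tol:
            break
        # Centroid of every vertex except the worst one.
        c = [sum(p[i] for p in simplex[:-1]) / n for i in range(n)]
        reflect = [c[i] + (c[i] - worst[i]) for i in range(n)]
        if f(reflect) < f(best):
            expand = [c[i] + 2.0 * (c[i] - worst[i]) for i in range(n)]
            simplex[-1] = expand if f(expand) < f(reflect) else reflect
        elif f(reflect) < f(simplex[-2]):
            simplex[-1] = reflect
        else:
            contract = [c[i] + 0.5 * (worst[i] - c[i]) for i in range(n)]
            if f(contract) < f(worst):
                simplex[-1] = contract
            else:
                # Shrink every vertex halfway towards the best one.
                simplex = [best] + [
                    [best[i] + 0.5 * (p[i] - best[i]) for i in range(n)]
                    for p in simplex[1:]
                ]
    return min(simplex, key=f)


def select_action(q: Callable[[List[float], List[float]], float],
                  state: List[float], a_init: List[float],
                  a_min: float, a_max: float) -> List[float]:
    """Greedy action: approximate argmax_a Q(state, a) by minimising -Q."""
    def clip(a: List[float]) -> List[float]:
        return [min(max(ai, a_min), a_max) for ai in a]

    def neg_q(a: List[float]) -> float:
        # Clip so Q is never evaluated outside the (assumed) action bounds.
        return -q(state, clip(a))

    return clip(nelder_mead(neg_q, a_init))


def toy_q(state: List[float], action: List[float]) -> float:
    """Hypothetical stand-in for a learned Q; peaks at action = 0.3 * state."""
    return -(action[0] - 0.3 * state[0]) ** 2


# Usage: pick the greedy action for one state, starting the search at 0.
action = select_action(toy_q, [1.0], [0.0], -1.0, 1.0)
print(action)
```

Because the search only evaluates Q and never differentiates it, the same routine would work with any value-function approximator that can be queried for a state-action pair. A learned Q need not have a single peak, in which case a local search of this kind may return a local maximum.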

Keywords :

"Learning (artificial intelligence)","Training","Mathematical model","Switches","Reactive power","Newton method","Optimization"

Publisher :

IEEE

Conference_Title :

Systems, Man, and Cybernetics (SMC), 2015 IEEE International Conference on

Type :

conf

DOI :

10.1109/SMC.2015.364

Filename :

7379496

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3728260