Regional Information Center for Science and Technology - Performance of temporal differences and reinforcement learning in the cart-pole experiment

DocumentCode :

2972349

Title :

Performance of temporal differences and reinforcement learning in the cart-pole experiment

Author :

Geva, Shlomo ; Sitte, Joaquin

Author_Institution :

Fac. of Inf. Technol., Queensland Univ. of Technol., Brisbane, Qld., Australia

Volume :

fYear :

1993

fDate :

25-29 Oct. 1993

Firstpage :

2835

Abstract :

A comparison of the many proposed schemes for learning control tasks requires a standardised characterisation of their performance. Such characterisations are not available for most of the methods used in connection with neural networks. We undertook to characterise the performance of the adaptive heuristic critic (AHC) method of Barto et al. (1983) on the cart-pole balancing problem. We present criteria for training and control performance and use them to compare AHC controllers with the best PD controllers. It turns out that the learning dynamics of the AHC method for the cart-pole is apparently chaotic, making performance assessment and parameter optimisation computationally very laborious.

Keywords :

adaptive control; intelligent control; learning (artificial intelligence); neurocontrollers; performance evaluation; robots; adaptive heuristic critic method; cart-pole balancing; learning control; neural networks; reinforcement learning; temporal differences; Australia; Bang-bang control; Chaos; Information technology; Learning; Optimization methods; PD control; Quantization; State-space methods

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Neural Networks, 1993. IJCNN '93-Nagoya. Proceedings of 1993 International Joint Conference on

Print_ISBN :

0-7803-1421-2

Type :

conf

DOI :

10.1109/IJCNN.1993.714314

Filename :

714314

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2972349