DocumentCode
587314
Title
A modification of gradient policy in reinforcement learning procedure
Author
Abas, Marcel ; Skripcak, Tomas
Author_Institution
Inst. of Appl. Inf., Autom. & Math., Slovak Univ. of Technol. in Bratislava, Trnava, Slovakia
fYear
2012
fDate
26-28 Sept. 2012
Firstpage
1
Lastpage
2
Abstract
The gradient of a scalar function is frequently used in various areas of mathematics. In informatics it can be used, for example, in the learning procedures of many control systems. The key observation is that the gradient, if it is a non-zero vector, points in the direction of the greatest rate of increase of the scalar function. In this contribution we show a method for determining the direction(s) even if the gradient is the zero vector. We show that this can be done with the knowledge that students have at their stage of study.
Keywords
gradient methods; intelligent robots; learning (artificial intelligence); vectors; control system; gradient policy modification; informatics; mathematics; nonzero vector gradient; reinforcement learning procedure; robot learning; scalar function gradient; zero vector gradient; Automation; Control systems; Educational institutions; Informatics; Learning; Robots; Vectors; control system; direction of greatest rate; gradient policy; neuron networks;
fLanguage
English
Publisher
IEEE
Conference_Titel
Interactive Collaborative Learning (ICL), 2012 15th International Conference on
Conference_Location
Villach
Print_ISBN
978-1-4673-2425-0
Electronic_ISBN
978-1-4673-2426-7
Type
conf
DOI
10.1109/ICL.2012.6402200
Filename
6402200