Title :
A modification of gradient policy in reinforcement learning procedure
Author :
Abas, Marcel ; Skripcak, Tomas
Author_Institution :
Inst. of Appl. Inf., Autom. & Math., Slovak Univ. of Technol. in Bratislava, Trnava, Slovakia
Abstract :
The gradient of a scalar function is used in many areas of mathematics. In informatics it can be used, for example, in the learning procedure of many control systems. The key observation is that the gradient, if it is a non-zero vector, points in the direction of the greatest rate of increase of the scalar function. In this contribution we show a method for determining such direction(s) even if the gradient is the zero vector. We show that this can be done with the knowledge that students have at their stage of study.
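The abstract does not say how the direction is found when the gradient vanishes. As a rough illustration only, the sketch below assumes a common second-order fallback: if the gradient is numerically zero, take the eigenvector of the Hessian with the largest positive eigenvalue. This is not necessarily the authors' method, and the names `grad`, `hessian`, and `ascent_direction`, the finite-difference steps, and the tolerance are all hypothetical choices made for the example.

```python
import math

def grad(f, x, y, h=1e-5):
    """Central-difference estimate of the gradient of f at (x, y)."""
    fx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    fy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return fx, fy

def hessian(f, x, y, h=1e-4):
    """Finite-difference estimate of the Hessian entries (fxx, fxy, fyy)."""
    fxx = (f(x + h, y) - 2 * f(x, y) + f(x - h, y)) / h**2
    fyy = (f(x, y + h) - 2 * f(x, y) + f(x, y - h)) / h**2
    fxy = (f(x + h, y + h) - f(x + h, y - h)
           - f(x - h, y + h) + f(x - h, y - h)) / (4 * h**2)
    return fxx, fxy, fyy

def ascent_direction(f, x, y, tol=1e-6):
    """Unit direction of greatest increase of f at (x, y), or None.

    Non-zero gradient: return the normalised gradient.
    Zero gradient (assumed fallback, not taken from the paper): return the
    Hessian eigenvector belonging to the largest eigenvalue, if positive.
    """
    gx, gy = grad(f, x, y)
    norm = math.hypot(gx, gy)
    if norm > tol:
        return gx / norm, gy / norm

    a, b, c = hessian(f, x, y)
    # Largest eigenvalue of the symmetric matrix [[a, b], [b, c]].
    lam = (a + c) / 2 + math.sqrt(((a - c) / 2) ** 2 + b ** 2)
    if lam <= tol:
        return None  # local maximum or degenerate point at second order
    if abs(b) > tol:
        vx, vy = b, lam - a  # solves (a - lam) * vx + b * vy = 0
    else:
        vx, vy = (1.0, 0.0) if a >= c else (0.0, 1.0)
    n = math.hypot(vx, vy)
    return vx / n, vy / n

# Saddle point: the gradient of x^2 - y^2 vanishes at the origin.
saddle = lambda x, y: x**2 - y**2
print(ascent_direction(saddle, 1.0, 0.0))  # ordinary case, uses the gradient
print(ascent_direction(saddle, 0.0, 0.0))  # zero gradient, uses the Hessian
```

At a zero-gradient point the opposite vector is an equally good answer, since the second-order term is even in the direction; that may be why the abstract speaks of "direction(s)" in the plural.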
Keywords :
gradient methods; intelligent robots; learning (artificial intelligence); vectors; control system; gradient policy modification; informatics; mathematics; nonzero vector gradient; reinforcement learning procedure; robot learning; scalar function gradient; zero vector gradient; Automation; Control systems; Educational institutions; Informatics; Learning; Robots; Vectors; control system; direction of greatest rate; gradient policy; neuron networks;
Conference_Title :
Interactive Collaborative Learning (ICL), 2012 15th International Conference on
Conference_Location :
Villach
Print_ISBN :
978-1-4673-2425-0
Electronic_ISBN :
978-1-4673-2426-7
DOI :
10.1109/ICL.2012.6402200