DocumentCode :
2470867
Title :
Multi-objective reinforcement learning method for acquiring all pareto optimal policies simultaneously
Author :
Mukai, Yusuke ; Kuroe, Yasuaki ; Iima, Hitoshi
Author_Institution :
Dept. of Adv. Fibro Sci., Kyoto Inst. of Technol., Kyoto, Japan
fYear :
2012
fDate :
14-17 Oct. 2012
Firstpage :
1917
Lastpage :
1923
Abstract :
This paper studies multi-objective reinforcement learning problems in which an agent receives multiple rewards. Ordinary multi-objective reinforcement learning methods acquire only a single Pareto optimal policy by scalarizing the reward vector with a weighted sum, so different Pareto optimal policies can be obtained only by changing the weight vector and re-running the method. On the other hand, a method that acquires all Pareto optimal policies simultaneously has been proposed for problems whose environment model is known. Building on the idea of that method, we propose a method that acquires all Pareto optimal policies simultaneously for multi-objective reinforcement learning problems whose environment model is unknown. Furthermore, we show theoretically and experimentally that the proposed method finds the Pareto optimal policies.
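The following is a minimal sketch of the conventional weighted-sum scalarization baseline that the abstract contrasts against, not the authors' proposed method: the reward vector is collapsed to a scalar with a fixed weight vector, so each run of tabular Q-learning yields only one Pareto optimal policy. The environment interface (reset/step returning a reward vector) and all function names are assumptions introduced for illustration.

```python
import numpy as np

def scalarized_q_learning(env, weights, n_states, n_actions,
                          episodes=500, alpha=0.1, gamma=0.95, epsilon=0.1):
    """Tabular Q-learning on the scalarized reward w^T r.

    Illustrative sketch only; `env` is assumed to expose reset() -> state and
    step(action) -> (next_state, reward_vector, done).
    """
    Q = np.zeros((n_states, n_actions))
    for _ in range(episodes):
        state = env.reset()
        done = False
        while not done:
            # Epsilon-greedy action selection on the scalarized value function.
            if np.random.rand() < epsilon:
                action = np.random.randint(n_actions)
            else:
                action = int(np.argmax(Q[state]))
            next_state, reward_vec, done = env.step(action)
            # Weighted-sum scalarization of the multi-objective reward.
            scalar_reward = float(np.dot(weights, reward_vec))
            target = scalar_reward + gamma * np.max(Q[next_state]) * (not done)
            Q[state, action] += alpha * (target - Q[state, action])
            state = next_state
    # The greedy policy w.r.t. Q corresponds to a single Pareto optimal policy
    # for this particular choice of weights.
    return Q
```

Obtaining a different Pareto optimal policy under this baseline requires choosing a new `weights` vector and re-running the learning loop from scratch, which is precisely the repetition the paper's simultaneous-acquisition method is designed to avoid.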
Keywords :
Pareto optimisation; learning (artificial intelligence); multi-agent systems; vectors; Pareto optimal policies; agent; environment model; multiobjective reinforcement learning method; reward vector; scalarizing method; weight vector; Equations; Information science; Learning; Markov processes; Mathematical model; Pareto optimization; Vectors; Multi-objective problem; Pareto optimal policy; Reinforcement learning;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Systems, Man, and Cybernetics (SMC), 2012 IEEE International Conference on
Conference_Location :
Seoul
Print_ISBN :
978-1-4673-1713-9
Electronic_ISBN :
978-1-4673-1712-2
Type :
conf
DOI :
10.1109/ICSMC.2012.6378018
Filename :
6378018