Title :
Learning to Reach Optimal Equilibrium by Influence of Other Agents Opinion
Author :
Barrios-Aranibar, Dennis ; Goncalves, Luiz M. G.
Author_Institution :
Fed. Univ. of Rio Grande do Norte, Natal
Abstract :
In this work the authors extend the reinforcement learning paradigm for multi-agent systems called "influence value reinforcement learning" (IVRL). In previous work, an algorithm for repeated games was proposed and shown to outperform traditional paradigms. Here, the authors define an algorithm based on this paradigm for the case where agents must learn from delayed rewards, that is, an influence value reinforcement learning algorithm for two-agent stochastic games. The IVRL paradigm is inspired by human social interaction, specifically the fact that people communicate to one another what they think about each other's actions, and these opinions influence each other's behavior. A modified version of the Q-learning algorithm using this paradigm was constructed. The resulting IVQ-learning algorithm was implemented and compared with versions of Q-learning for independent learning and joint-action learning. The approach is shown to have a higher probability of converging to an optimal equilibrium than the IQ-learning and JAQ-learning algorithms, especially as exploration increases.
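The abstract does not give the IVQ-learning update rule, but the idea it describes (a Q-learner whose update is biased by another agent's opinion of its action) can be sketched roughly as follows. All names, the opinion signal, and the specific form of the influence term below are assumptions for illustration, not the paper's actual formulation:

```python
import random


class IVQAgent:
    """Hypothetical sketch of an influence-biased Q-learner.

    The other agent's 'opinion' of this agent's last action is folded
    into the temporal-difference target, scaled by beta. This is an
    illustrative guess at the mechanism, not the published algorithm.
    """

    def __init__(self, n_states, n_actions, alpha=0.1, gamma=0.9,
                 epsilon=0.1, beta=0.5):
        self.q = [[0.0] * n_actions for _ in range(n_states)]
        self.alpha, self.gamma = alpha, gamma
        self.epsilon, self.beta = epsilon, beta
        self.n_actions = n_actions

    def act(self, state):
        # Epsilon-greedy action selection.
        if random.random() < self.epsilon:
            return random.randrange(self.n_actions)
        row = self.q[state]
        return row.index(max(row))

    def opinion(self, reward_delta):
        # A simple opinion signal: the sign of the change in this
        # agent's reward after the other agent's action
        # (+1 approval, -1 disapproval, 0 neutral).
        return (reward_delta > 0) - (reward_delta < 0)

    def update(self, s, a, r, s_next, other_opinion):
        # Standard TD target plus a scaled influence term from the
        # other agent's opinion of action a.
        target = (r + self.beta * other_opinion
                  + self.gamma * max(self.q[s_next]))
        self.q[s][a] += self.alpha * (target - self.q[s][a])
```

With beta = 0 this reduces to ordinary independent Q-learning; a positive opinion raises the effective target for the commented action, nudging both agents toward mutually approved joint actions.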
Keywords :
learning (artificial intelligence); multi-robot systems; stochastic games; IQ-learning algorithm; JAQ-learning algorithm; IVQ-learning algorithm; influence value reinforcement learning; multiagent systems; Automation; Collaborative work; Delay; Game theory; Hybrid intelligent systems; Learning; Multiagent systems; Nash equilibrium; Stochastic processes; Testing;
Conference_Titel :
Hybrid Intelligent Systems, 2007. HIS 2007. 7th International Conference on
Conference_Location :
Kaiserslautern
Print_ISBN :
978-0-7695-2946-2
DOI :
10.1109/HIS.2007.61