Regional Information Center for Science and Technology (RICeST) - Decentralized learning for traffic signal control

DocumentCode :

705789

Title :

Decentralized learning for traffic signal control

Author :

Prabuchandran, K.J. ; Hemanth Kumar, A.N. ; Bhatnagar, Shalabh

Author_Institution :

Dept. of Comput. Sci. & Autom., Indian Inst. of Sci., Bangalore, India

fYear :

2015

fDate :

6-10 Jan. 2015

Firstpage :

Lastpage :

Abstract :

In this paper, we study the problem of obtaining the optimal order of the phase sequence [14] in a road network for efficiently managing the traffic flow. We model this problem as a Markov decision process (MDP). The problem is hard to solve when all the junctions in the road network are considered simultaneously, so we propose a decentralized multi-agent reinforcement learning (MARL) algorithm that treats each junction in the road network as a separate agent (controller). Each agent optimizes the order of the phase sequence using Q-learning with either ε-greedy or UCB [3] based exploration strategies. Coordination between the junctions is achieved through the cost feedback signal received from the neighbouring junctions; the learning algorithm of each agent updates its Q-factors using this feedback signal. We show through simulations in VISSIM that our algorithms perform significantly better than the standard fixed signal timing (FST), the saturation balancing (SAT) [14] and the round-robin multi-agent reinforcement learning algorithms [11] on two real road networks.
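
The per-junction learner described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class name `JunctionAgent`, the state and action encoding, the hyperparameters, and in particular the choice to add the neighbours' costs to the junction's own cost are all assumptions made for illustration. It only shows tabular Q-learning on costs (lower is better) with ε-greedy and UCB action selection.

```python
import math
import random
from collections import defaultdict


class JunctionAgent:
    """Hypothetical per-junction Q-learning agent that minimises cost."""

    def __init__(self, n_actions, alpha=0.1, gamma=0.9,
                 epsilon=0.1, ucb_c=1.0, seed=0):
        self.n_actions = n_actions      # e.g. number of candidate phase orders
        self.alpha = alpha              # learning rate
        self.gamma = gamma              # discount factor
        self.epsilon = epsilon          # exploration probability (epsilon-greedy)
        self.ucb_c = ucb_c              # exploration weight (UCB)
        self.q = defaultdict(lambda: [0.0] * n_actions)    # Q-factors
        self.counts = defaultdict(lambda: [0] * n_actions)  # visit counts
        self.rng = random.Random(seed)

    def greedy(self, state):
        # Costs are minimised, so the greedy action has the smallest Q-factor.
        q = self.q[state]
        return min(range(self.n_actions), key=lambda a: q[a])

    def act_epsilon_greedy(self, state):
        # With probability epsilon pick a random action, otherwise be greedy.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_actions)
        return self.greedy(state)

    def act_ucb(self, state):
        counts = self.counts[state]
        # Try every action once before applying the confidence bound.
        for a in range(self.n_actions):
            if counts[a] == 0:
                return a
        total = sum(counts)
        q = self.q[state]
        # Subtract the bonus because lower cost is better.
        return min(
            range(self.n_actions),
            key=lambda a: q[a] - self.ucb_c * math.sqrt(math.log(total) / counts[a]),
        )

    def update(self, state, action, own_cost, neighbour_costs, next_state):
        # Assumption: feedback from neighbouring junctions is simply summed
        # with the junction's own cost.
        cost = own_cost + sum(neighbour_costs)
        target = cost + self.gamma * min(self.q[next_state])
        self.q[state][action] += self.alpha * (target - self.q[state][action])
        self.counts[state][action] += 1
```

In a simulation loop, each junction would hold one such agent, pick a phase order with `act_epsilon_greedy` or `act_ucb`, observe its own cost and the costs reported by its neighbours, and then call `update`.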

Keywords :

Markov processes; decentralised control; decision making; greedy algorithms; learning (artificial intelligence); learning systems; multi-agent systems; network theory (graphs); optimal control; road traffic control; ε-greedy; FST; MARL algorithm; MDP; Markov decision process; Q-learning; Q-factors; SAT; UCB based exploration strategies; VISSIM; cost feedback signal; decentralized learning; decentralized multiagent reinforcement learning algorithm; learning algorithm; phase sequence; road network junctions; round-robin multiagent reinforcement learning algorithms; saturation balancing; standard fixed signal timing; traffic flow; traffic signal control; Approximation algorithms; Delays; Junctions; Q-factor; Roads; Sensors; Vehicles; Q-learning; UCB; VISSIM; multi-agent reinforcement learning; optimal phase sequence; traffic signal control

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Communication Systems and Networks (COMSNETS), 2015 7th International Conference on

Conference_Location :

Bangalore

Type :

conf

DOI :

10.1109/COMSNETS.2015.7098712

Filename :

7098712

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=705789