Title :
Low latency policy iteration via parallel processing and randomization
Author :
Neal Master; Nicholas Bambos
Author_Institution :
Department of Electrical Engineering, Stanford University, CA, 94305, USA
Abstract :
Infinite-horizon discounted-cost Markov Decision Processes (MDPs) are ubiquitous in operations research, communications engineering, robotics, and a variety of other control applications. Consequently, there is significant value in being able to compute optimal policies more quickly. Previous work has focused on exploiting certain structural properties in particular problems with the aim of reducing algorithmic complexity. In this paper, we take a complementary approach in which we do not assume specialized structure, but instead consider how parallel computing can be used to reduce computational latency. We propose a randomized and parallelized algorithm that iterates in policy space. We demonstrate numerically and prove analytically that an exponential increase in the number of available processing cores leads to a linear reduction in computational latency. While this exponential relationship may seem discouraging, our numerical example shows that we can achieve a marked reduction in latency even with a moderate number of processors.
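The abstract centers on policy iteration for infinite-horizon discounted-cost MDPs. For reference, the sketch below shows the standard serial, deterministic policy iteration loop on a small synthetic MDP; it is not the paper's randomized, parallelized algorithm, and the function names and toy problem data are hypothetical illustrations only.

```python
# Minimal sketch of standard policy iteration for an infinite-horizon
# discounted-cost MDP. This is the serial baseline that the paper's
# randomized, parallelized variant is designed to accelerate; the toy
# problem and all names here are hypothetical.
import numpy as np

def policy_iteration(P, c, gamma, max_iters=1000):
    """P: (A, S, S) transition matrices, c: (S, A) stage costs, gamma: discount factor."""
    num_actions, num_states, _ = P.shape
    policy = np.zeros(num_states, dtype=int)  # start from an arbitrary policy
    for _ in range(max_iters):
        # Policy evaluation: solve (I - gamma * P_pi) J = c_pi for the cost-to-go J.
        P_pi = P[policy, np.arange(num_states), :]   # (S, S) transitions under the policy
        c_pi = c[np.arange(num_states), policy]      # (S,) stage costs under the policy
        J = np.linalg.solve(np.eye(num_states) - gamma * P_pi, c_pi)
        # Policy improvement: greedy (cost-minimizing) one-step lookahead.
        Q = c.T + gamma * P @ J                      # (A, S) state-action costs
        new_policy = Q.argmin(axis=0)
        if np.array_equal(new_policy, policy):
            break                                    # policy is stable, hence optimal
        policy = new_policy
    return policy, J

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    S, A, gamma = 5, 3, 0.9
    P = rng.random((A, S, S))
    P /= P.sum(axis=2, keepdims=True)   # normalize rows into transition probabilities
    c = rng.random((S, A))              # random stage costs
    policy, J = policy_iteration(P, c, gamma)
    print("optimal policy:", policy)
```

The expensive step is the exact policy evaluation (a linear solve per iteration); the paper's contribution concerns distributing and randomizing work of this kind across many cores to cut wall-clock latency, rather than changing the serial complexity.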
Keywords :
"Program processors","Algorithm design and analysis","Computational modeling","Aerospace electronics","Complexity theory","Convergence","Parallel algorithms"
Conference_Title :
2015 54th IEEE Conference on Decision and Control (CDC)
DOI :
10.1109/CDC.2015.7402356