DocumentCode :
1633647
Title :
Consensus-based distributed optimization: Practical issues and applications in large-scale machine learning
Author :
Tsianos, Konstantinos I.; Lawlor, S.; Rabbat, Michael G.
Author_Institution :
Dept. of Electr. & Comput. Eng., McGill Univ., Montreal, QC, Canada
Year :
2012
Firstpage :
1543
Lastpage :
1550
Abstract :
This paper discusses practical consensus-based distributed optimization algorithms, in which nodes interleave local gradient descent steps with consensus iterations. Gradient steps drive the solution toward a minimizer, while consensus iterations synchronize the values so that all nodes converge to a network-wide optimum when the objective is convex and separable. The consensus update requires communication. If communication is synchronous and nodes wait to receive one message from each of their neighbors before updating, then progress is limited by the slowest node. To be robust to failing or stalling nodes, asynchronous communication should be used. Asynchronous protocols that use bi-directional communication can deadlock, so one-directional protocols are necessary. With one-directional asynchronous protocols, however, it is no longer possible to guarantee that the consensus matrix is doubly stochastic. At the same time, it is essential that the coordination protocol achieve consensus on the average in order to avoid biasing the optimization objective. We report on experiments running Push-Sum Distributed Dual Averaging for convex optimization on an MPI cluster. The experiments illustrate the benefits of asynchronous consensus-based distributed optimization when some nodes are unreliable and may fail, or when messages experience time-varying delays.
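Example :
The abstract's central point is that one-directional (push-only) protocols yield column-stochastic rather than doubly stochastic consensus matrices, yet the protocol must still reach consensus on the exact average to avoid biasing the objective. The push-sum primitive reconciles these requirements. Below is a minimal Python/NumPy sketch of push-sum average consensus; the network size, random out-neighbor selection, and iteration count are illustrative assumptions rather than details from the paper, whose full algorithm additionally interleaves dual-averaging gradient steps with this mixing.

import numpy as np

# Push-sum average consensus on a hypothetical 5-node network.
rng = np.random.default_rng(0)
n = 5
x = rng.normal(size=n)   # each node's local value
w = np.ones(n)           # push-sum weights, initialized to 1
true_avg = x.mean()

for _ in range(200):
    # Column-stochastic mixing: each node keeps half of its (value, weight)
    # mass and pushes half to one randomly chosen out-neighbor (possibly
    # itself). No node waits on incoming messages, so communication is
    # one-directional and a slow or failed neighbor cannot block progress.
    P = np.zeros((n, n))
    for j in range(n):
        k = rng.integers(n)
        P[j, j] += 0.5
        P[k, j] += 0.5
    x, w = P @ x, P @ w  # columns of P sum to 1, so totals are conserved

# Each node's ratio x_i / w_i converges to the network-wide average even
# though P is never doubly stochastic; this is what removes the bias.
print(np.max(np.abs(x / w - true_avg)))  # error shrinks geometrically toward 0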
Keywords :
convex programming; gradient methods; learning (artificial intelligence); message passing; protocols; MPI cluster; asynchronous communications; asynchronous protocols; bidirectional communications; consensus iterations; convex optimization; coordination protocol; failing nodes; large-scale machine learning; nodes interleave local gradient descent steps; practical consensus-based distributed optimization algorithms; push-sum distributed dual averaging; stalling nodes; time-varying delays; Convergence; Delays; Machine learning algorithms; Optimization; Peer-to-peer computing; Protocols; Stochastic processes
Language :
English
Publisher :
IEEE
Conference_Title :
2012 50th Annual Allerton Conference on Communication, Control, and Computing (Allerton)
Conference_Location :
Monticello, IL
Print_ISBN :
978-1-4673-4537-8
Type :
conf
DOI :
10.1109/Allerton.2012.6483403
Filename :
6483403