DocumentCode :
3600146
Title :
Reducing communication overhead in distributed learning by an order of magnitude (almost)
Author :
Oland, Anders ; Raj, Bhiksha
Author_Institution :
Sch. of Comput. Sci., Carnegie Mellon Univ., Pittsburgh, PA, USA
fYear :
2015
Firstpage :
2219
Lastpage :
2223
Abstract :
Large-scale distributed learning plays an increasingly important role in modern computing. However, whether one uses a compute cluster with thousands of nodes or a single multi-GPU machine, the most significant bottleneck is communication. In this work, we explore the effects of applying quantization and encoding to the parameters of distributed models. We show that, for a neural network, this can be done without slowing convergence or hurting the generalization of the model. In fact, in our experiments we reduced the communication overhead by nearly an order of magnitude while actually improving generalization accuracy.
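The abstract only names the general technique (quantizing and encoding the parameters exchanged between workers), not its details. As a rough illustrative sketch of that idea, and not the authors' exact scheme, the Python snippet below uniformly quantizes a parameter tensor to a few bits per value before it would be transmitted; the 4-bit width, scaling scheme, and helper names are assumptions made for the example.

```python
# Illustrative sketch only: uniform low-bit quantization of a parameter tensor,
# so each value costs a few bits on the wire instead of 32. The bit-width and
# function names are assumptions, not the method described in the paper.
import numpy as np

def quantize(params: np.ndarray, bits: int = 4):
    """Uniformly quantize a float32 array to 2**bits levels.

    Returns integer codes plus the (offset, step) needed to decode them.
    """
    lo, hi = float(params.min()), float(params.max())
    levels = 2 ** bits - 1
    step = (hi - lo) / levels if hi > lo else 1.0
    codes = np.round((params - lo) / step).astype(np.uint8)
    return codes, lo, step

def dequantize(codes: np.ndarray, lo: float, step: float) -> np.ndarray:
    """Reconstruct approximate float32 parameters from the integer codes."""
    return (lo + codes.astype(np.float32) * step).astype(np.float32)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.normal(size=10_000).astype(np.float32)   # stand-in for a weight tensor
    codes, lo, step = quantize(w, bits=4)            # 4 bits/value, ~8x smaller than float32
    w_hat = dequantize(codes, lo, step)
    print("max abs reconstruction error:", np.abs(w - w_hat).max())
```

In a distributed setting, the small integer codes (optionally entropy-encoded) would be sent in place of the full-precision values, which is where the communication savings come from.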
Keywords :
distributed processing; learning (artificial intelligence); neural nets; distributed model parameters; encoding; large-scale distributed learning; multi-GPU machine; neural network; Accuracy; Convergence; Encoding; Entropy; Heuristic algorithms; Quantization (signal); Training; Compression; Distributed Training; Neural Networks;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on
Type :
conf
DOI :
10.1109/ICASSP.2015.7178365
Filename :
7178365