DocumentCode :
154138
Title :
Kylix: A Sparse Allreduce for Commodity Clusters
Author :
Zhao, Huasha ; Canny, John
Author_Institution :
Comput. Sci. Div., Univ. of California, Berkeley, Berkeley, CA, USA
fYear :
2014
fDate :
9-12 Sept. 2014
Firstpage :
273
Lastpage :
282
Abstract :
Allreduce is a basic building block for parallel computing. Our target here is "Big Data" processing on commodity clusters (mostly sparse power-law data). Allreduce can be used to synchronize models, to maintain distributed datasets, and to perform operations on distributed data such as sparse matrix multiply. We first review a key constraint on cluster communication, the minimum efficient packet size, which hampers the use of direct all-to-all protocols on large networks. Our allreduce network is a nested, heterogeneous-degree butterfly. We show that communication volume in the lower layers is typically much less than in the top layer, and that total communication across all layers is a small constant factor larger than the top layer alone, which is close to optimal. A chart of network communication volume across layers has a characteristic "Kylix" shape, which gives the method its name. For optimum performance, the butterfly degrees also decrease down the layers. Furthermore, to efficiently route sparse updates to the nodes that need them, the network must be nested. While the approach is amenable to various kinds of sparse data, almost all "Big Data" sets show power-law statistics, and from the properties of these we derive methods for optimal network design. Finally, we present experiments with Kylix on Amazon EC2, demonstrating significant improvements over existing systems such as PowerGraph and Hadoop.
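The abstract describes a nested butterfly whose per-layer communication volume shrinks down the layers. The single-process Python sketch below, written only from that description, simulates the reduce-scatter half of such a network on synthetic power-law data and prints per-layer traffic. The degrees, dimensions, and hash-based routing are illustrative assumptions, not the authors' implementation.

from collections import Counter
import random

DEGREES = [6, 4]                  # butterfly degrees, top layer first (assumed)
P = DEGREES[0] * DEGREES[1]       # 24 simulated nodes
DIM = 20000                       # feature-space size (assumed)
NNZ = 500                         # non-zeros generated per node (assumed)
rng = random.Random(0)

def power_law_vector():
    # Indices drawn with density ~ x^(-2/3): a few hot features, a long tail.
    v = Counter()
    for _ in range(NNZ):
        v[int(DIM * rng.random() ** 3)] += 1.0
    return v

def home_digits(idx):
    # Each index has a "home" node; one digit of its address per layer.
    h = idx % P
    return h // DEGREES[1], h % DEGREES[1]

nodes = {(i // DEGREES[1], i % DEGREES[1]): power_law_vector() for i in range(P)}

# Reduce-scatter down the layers: at layer l every node forwards each index
# whose l-th home digit differs from its own, and sums what it receives.
for layer in (0, 1):
    sent = 0
    staged = {nid: Counter() for nid in nodes}
    for (i1, i2), vec in nodes.items():
        for idx, val in vec.items():
            h1, h2 = home_digits(idx)
            dest = (h1, i2) if layer == 0 else (i1, h2)
            staged[dest][idx] += val
            if dest != (i1, i2):
                sent += 1
    nodes = staged
    print(f"layer {layer + 1} traffic: {sent} words")

Running the sketch shows layer-2 traffic falling below layer-1 traffic, because power-law indices collide when partial sums merge; mirrored by the allgather on the way back up, the per-layer volume profile traces the goblet ("Kylix") shape the paper is named for.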
Keywords :
Big Data; data mining; graph theory; learning (artificial intelligence); matrix multiplication; network theory (graphs); parallel processing; sparse matrices; synchronisation; tree data structures; Amazon EC2; Big Data processing; characteristic Kylix shape; cluster communication; commodity clusters; communication volume; direct all-to-all protocols; distributed datasets; distributed graph mining; low-layers; machine learning; minimum efficient packet size; model synchronization; nested-heterogeneous-degree butterfly; network communication volume; optimal network design; optimum performance; parallel computing; power-law statistics; sparse allreduce; sparse matrix multiply; sparse power-law data; sparse updates; top-layer; Benchmark testing; Big data; Fault tolerance; Machine learning algorithms; Sparse matrices; Topology; Vectors;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Parallel Processing (ICPP), 2014 43rd International Conference on
Conference_Location :
Minneapolis, MN, USA
ISSN :
0190-3918
Type :
conf
DOI :
10.1109/ICPP.2014.36
Filename :
6957236