Title :
PuLP: Scalable multi-objective multi-constraint partitioning for small-world networks
Author :
Slota, George M. ; Madduri, Kamesh ; Rajamanickam, Sivasankaran
Author_Institution :
Dept. of Comput. Sci. & Eng., Pennsylvania State Univ., University Park, PA, USA
Abstract :
We present PuLP, a parallel and memory-efficient graph partitioning method specifically designed to partition low-diameter networks with skewed degree distributions. Graph partitioning is an important Big Data problem because it impacts the execution time and energy efficiency of graph analytics on distributed-memory platforms. Partitioning determines the in-memory layout of a graph, which affects locality, intertask load balance, communication time, and overall memory utilization of graph analytics. A novel feature of our method PuLP (Partitioning using Label Propagation) is that it optimizes for multiple objective metrics simultaneously, while satisfying multiple partitioning constraints. Using our method, we are able to partition a web crawl with billions of edges on a single compute server in under a minute. For a collection of test graphs, we show that PuLP uses 8-39× less memory than state-of-the-art partitioners and is up to 14.5× faster, on average, than alternate approaches (with 16-way parallelism). We also achieve better partitioning quality results for the multi-objective scenario.
Keywords :
Big Data; graph theory; parallel processing; resource allocation; search engines; small-world networks; Big Data problem; PULP; Web crawl partitioning quality; communication time; distributed-memory platforms; energy efficiency; execution time; graph analytics; graph locality; in-memory graph layout; intertask load balancing; low-diameter network partitioning; multiple objective metrics optimization; multiple partitioning constraints; overall memory graph analytics utilization; parallel-memory-efficient graph partitioning method; parallelism; partitioning-using-label propagation; scalable multiobjective multiconstraint partitioning; skewed degree distributions; small-world networks; Clustering algorithms; Communities; Heuristic algorithms; Layout; Measurement; Memory management; Partitioning algorithms;
Conference_Titel :
Big Data (Big Data), 2014 IEEE International Conference on
Conference_Location :
Washington, DC
DOI :
10.1109/BigData.2014.7004265