DocumentCode :
656153
Title :
Fast Approximate Subgraph Counting and Enumeration
Author :
Slota, George M. ; Madduri, Kamesh
Author_Institution :
Dept. of Comput. Sci. & Eng., Pennsylvania State Univ., University Park, PA, USA
fYear :
2013
fDate :
1-4 Oct. 2013
Firstpage :
210
Lastpage :
219
Abstract :
We present a new shared-memory parallel algorithm and implementation called FASCIA for the problems of approximate sub graph counting and sub graph enumeration. The problem of sub graph counting refers to determining the frequency of occurrence of a given sub graph (or template) within a large network. This is a key graph analytic with applications in various domains. In bioinformatics, sub graph counting is used to detect and characterize local structure (motifs) in protein interaction networks. Exhaustive enumeration and exact counting is extremely compute-intensive, with running time growing exponentially with the number of vertices in the template. In this work, we apply the color coding technique to determine approximate counts of non-induced occurrences of the sub graph in the original network. Color coding gives a fixed-parameter algorithm for this problem, using a dynamic programming-based counting approach. Our new contributions are a multilevel shared-memory parallelization of the counting scheme and several optimizations to reduce the memory footprint. We show that approximate counts can be obtained for templates with up to 12 vertices, on networks with up to millions of vertices and edges. Prior work on this problem has only considered out-of-core parallelization on distributed platforms. With our new counting scheme, data layout optimizations, and multicore parallelism, we demonstrate a significant speedup over the current state-of-the-art for sub graph counting.
Keywords :
bioinformatics; dynamic programming; graph colouring; parallel algorithms; proteins; shared memory systems; FASCIA; bioinformatics; color coding technique; data layout optimizations; distributed platforms; dynamic programming-based counting approach; exact counting; exhaustive enumeration; fast approximate subgraph counting and enumeration; fixed-parameter algorithm; local structure characterization; local structure detection; multicore parallelism; multilevel shared-memory parallelization; out-of-core parallelization; protein interaction networks; shared-memory parallel algorithm; Color; Dynamic programming; Encoding; Heuristic algorithms; Image color analysis; Indexes; Partitioning algorithms; color coding; motifs; subgraph counting;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Parallel Processing (ICPP), 2013 42nd International Conference on
Conference_Location :
Lyon
ISSN :
0190-3918
Type :
conf
DOI :
10.1109/ICPP.2013.30
Filename :
6687354
Link To Document :
بازگشت