Title :
Scalable Analysis of Massive Graphs on a Parallel Data Flow System
Author_Institution :
Center for Appl. Sci. Comput., Lawrence Livermore Nat. Lab., Livermore, CA, USA
Abstract :
This paper studies the feasibility of using dataflow systems to run complex graph queries and proposes a general query optimization framework for parallel dataflow systems. The proposed methods are used to optimize a suite of benchmark queries, and their effectiveness is evaluated. The performance of the optimized queries is measured on an actual parallel dataflow machine using a large semantic graph and compared with that of equivalent SQL queries on a high-end parallel relational database system. The study shows that a dataflow system can achieve significant performance improvements over state-of-the-art database systems and can be a viable, scalable alternative for running large, complex graph queries.
Keywords :
SQL; data flow graphs; graphs; query processing; relational databases; equivalent SQL queries; high-end parallel relational database system; massive graphs; parallel data flow system; query optimization framework; scalable analysis; Computer architecture; Concurrent computing; Data flow computing; Flow graphs; Optimization methods; Parallel processing; Performance analysis; Query processing; Relational databases; Scientific computing
Conference_Title :
System Sciences (HICSS), 2010 43rd Hawaii International Conference on
Conference_Location :
Honolulu, HI
Print_ISBN :
978-1-4244-5509-6
Electronic_ISBN :
1530-1605
DOI :
10.1109/HICSS.2010.325