DocumentCode :
3501568
Title :
MSSG: A Framework for Massive-Scale Semantic Graphs
Author :
Hartley, Timothy D R ; Catalyurek, Umit ; Özgüner, Füsun ; Yoo, Andy ; Kohn, Scott ; Henderson, Keith
Author_Institution :
Dept. of Electr. & Comput. Eng., The Ohio State Univ.
fYear :
2006
fDate :
25-28 Sept. 2006
Firstpage :
1
Lastpage :
10
Abstract :
This paper presents a middleware framework for storing, accessing and analyzing massive-scale semantic graphs. The framework, MSSG, targets scale-free semantic graphs with O(10^12) (trillion) vertices and edges. Here, we present the overall architectural design of the framework, as well as a prototype implementation for cluster architectures. The sheer size of these massive-scale semantic graphs prohibits storing the entire graph in memory even on medium- to large-scale parallel architectures. We therefore propose a new graph database, grDB, for the efficient storage and retrieval of large scale-free semantic graphs on secondary storage. This new database supports the efficient and scalable execution of parallel out-of-core graph algorithms which are essential for analyzing semantic graphs of massive size. We have also developed a parallel out-of-core breadth-first search algorithm for performance study. To the best of our knowledge, it is the first of such algorithms reported in the literature. Experimental evaluations on large real-world semantic graphs show that the MSSG framework scales well, and grDB outperforms widely used open-source out-of-core databases, such as BerkeleyDB and MySQL, in the storage and retrieval of scale-free graphs.
Keywords :
SQL; graph theory; middleware; parallel architectures; tree searching; BerkeleyDB; MySQL; architectural design; cluster architectures; graph database; massive-scale semantic graphs; middleware framework; open-source out-of-core databases; parallel architectures; parallel out-of-core breadth-first search algorithm; parallel out-of-core graph algorithms; prototype implementation; scale-free semantic graphs; Algorithm design and analysis; Computer networks; Information retrieval; Large-scale systems; Middleware; Open source software; Parallel architectures; Proteins; Prototypes; Relational databases;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Cluster Computing, 2006 IEEE International Conference on
Conference_Location :
Barcelona
ISSN :
1552-5244
Print_ISBN :
1-4244-0327-8
Electronic_ISBN :
1552-5244
Type :
conf
DOI :
10.1109/CLUSTR.2006.311857
Filename :
4100363