Title :
Table2Graph: A Scalable Graph Construction from Relational Tables Using Map-Reduce
Author :
Lee, Sangkeun; Park, Byung H.; Lim, Seung-Hwan; Shankar, Mallikarjun
Author_Institution :
Oak Ridge National Laboratory, Oak Ridge, TN, USA
Date :
March 30 - April 2, 2015
Abstract :
Identifying correlations and relationships between entities within and across different data sets (or databases) is of great importance in many domains. Data warehouse-based integration, the most widely practiced approach, has proven inadequate for this goal. Instead, we explored an alternative solution that turns multiple disparate data sources into a single heterogeneous graph model, so that matching entities across different source data can be expedited by examining their linkages in the graph. We found, however, that while a graph-based model provides outstanding capabilities for this purpose, constructing such a model from relational source databases was time-consuming and primarily left to ad hoc proprietary scripts. This led us to develop a reconfigurable and reusable graph construction tool designed to work at scale. In this paper, we introduce Table2Graph, a graph construction tool based on the Map-Reduce framework over Hadoop. We also discuss results from applying Table2Graph to integrate disparate healthcare databases.
Keywords :
data handling; data warehouses; graph theory; health care; parallel processing; relational databases; MapReduce; Table2Graph; data warehouse-based integration; graph construction tool; graph-based model; healthcare database; relational source database; Data conversion; Data models; Load modeling; Relational databases; Resource description framework; TV; Construction; ETL; Graph; Heterogeneous; Map-Reduce
Conference_Title :
Big Data Computing Service and Applications (BigDataService), 2015 IEEE First International Conference on
Conference_Location :
Redwood City, CA
DOI :
10.1109/BigDataService.2015.52