DocumentCode :
169926
Title :
The Deep Data Warehouse: Link-Based Integration and Enrichment of Warehouse Data and Unstructured Content
Author :
Groger, Christoph ; Schwarz, Holger ; Mitschang, Bernhard
Author_Institution :
Inst. of Parallel & Distrib. Syst., Univ. of Stuttgart, Stuttgart, Germany
fYear :
2014
fDate :
1-5 Sept. 2014
Firstpage :
210
Lastpage :
217
Abstract :
Data warehouses are at the core of enterprise IT and enable the efficient storage and analysis of structured data. Besides, unstructured content, e.g., emails and documents, constitutes more than half of the entire enterprise data and contains a lot of implicit knowledge about warehouse entities. Thus, holistic analytics require the integration of structured warehouse data and unstructured content to generate novel insights. These insights can also be used to enrich the integrated data and to create a new basis for further analytics. Existing integration approaches only support a limited range of analytical applications and require the costly adaptation of the warehouse schema. In this paper, we present the Deep Data Warehouse (DeepDWH), a novel type of data warehouse based on the flexible integration and enrichment of warehouse data and unstructured content, addressing the variety challenge of Big Data. It relies on information-rich instance-level links between warehouse elements and content items, which are represented in a graph-oriented structure. Neither adaptations of the existing warehouse nor the design of an overall federated schema are required. We design a conceptual linking model and develop a logical schema for links based on a property graph. As a proof of concept, we present a prototypical implementation of the DeepDWH including a link store based on a graph database.
Keywords :
data analysis; data integration; data warehouses; graphs; storage management; DeepDWH; conceptual linking model; content items; deep data warehouse; enterprise IT; enterprise data; flexible warehouse data enrichment; flexible warehouse data integration; graph database; graph-oriented structure; information-rich instance-level links; link-based integration; logical schema; property graph; structured data analysis; structured data storage; unstructured content; warehouse elements; warehouse entities; Analytical models; Data mining; Data models; Data warehouses; Joining processes; Unified modeling language; data integration; data warehouse; graph data; link-based integration; unstructured data;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Enterprise Distributed Object Computing Conference (EDOC), 2014 IEEE 18th International
Conference_Location :
Ulm
ISSN :
1541-7719
Type :
conf
DOI :
10.1109/EDOC.2014.36
Filename :
6972069