DocumentCode
2257515
Title
Integration of Data Warehouse and Unstructured Business Documents
Author
Alqarni, Ahmad Abdullah ; Pardede, Eric
Author_Institution
Dept. of Comput. Sci. & Comput. Eng., La Trobe Univ., Melbourne, VIC, Australia
fYear
2012
fDate
26-28 Sept. 2012
Firstpage
32
Lastpage
37
Abstract
The profusion of unstructured data has forced organizations to manage and exploit such data, especially in the decision-making process. Integrating or mapping unstructured data to a data warehouse is becoming increasingly important for bridging this gap and realizing the full potential of these data. In this paper, we propose a multi-layer schema for mapping between structured data stored in a data warehouse and unstructured data in business-related documents. The multi-layer schema facilitates the mapping between the two kinds of data. Linguistically correlated data are identified using WordNet to enable integration of the two data sources. We also propose a generic XML schema for business-related unstructured documents to assist the mapping. Using WordNet to identify matches is promising in the absence of schema instances and without the need for domain-specific knowledge.
Keywords
XML; data integration; data warehouses; decision making; WordNet; business-related unstructured documents; data sources; data warehouse integration; decision making process; generic XML schema; linguistic correlated data; multilayer schema; unstructured data forced organizations; unstructured data mapping; Data mining; Data models; Data warehouses; Organizations; Semantics; XML; XML schema matching; data integration; data warehouse; schema mapping; unstructured document;
fLanguage
English
Publisher
IEEE
Conference_Titel
Network-Based Information Systems (NBiS), 2012 15th International Conference on
Conference_Location
Melbourne, VIC
Print_ISBN
978-1-4673-2331-4
Type
conf
DOI
10.1109/NBiS.2012.59
Filename
6354804