DocumentCode
3243447
Title
Data transformation for warehousing Web data
Author
Zhu, Yan ; Bornhövd, Christof ; Buchmann, Alejandro P.
Author_Institution
Dept. of Comput. Sci., Darmstadt Univ. of Technol., Germany
fYear
2001
fDate
2001
Firstpage
74
Lastpage
85
Abstract
In order to analyze market trends and make reasonable business plans, a company's local data is not sufficient. Decision-making must also be based on information from suppliers, partners and competitors. This external data can be obtained from the World Wide Web in many cases, but must be integrated with the company's own data, e.g. in a data warehouse. To this end, Web data has to be mapped to the star schema of the warehouse. In this paper, we propose a semi-automatic approach to support this transformation process. Our approach is based on the use of a rooted labeled tree representation of Web data and the existing warehouse schema. Based on this common view, we can compare the source and target schemata to identify correspondences. We show how the correspondences guide the transformation to be accomplished automatically. We also explain the meaning of recursion and restructuring in mapping rules, which are the core of the transformation algorithm.
Keywords
data handling; data warehouses; information resources; marketing data processing; World Wide Web data warehousing; business plans; company data; competitor information; data integration; data mapping rules; data warehouse schema; decision-making; external data; market trends; partner information; recursion; restructuring; rooted labeled tree representation; schema correspondences; semi-automatic data transformation; star schema; supplier information; Algorithm design and analysis; Companies; Computer science; Data warehouses; Information analysis; Motion analysis; Performance analysis; Performance gain; Tree graphs; Warehousing;
fLanguage
English
Publisher
IEEE
Conference_Titel
Advanced Issues of E-Commerce and Web-Based Information Systems, WECWIS 2001, Third International Workshop on.
Conference_Location
San Jose, CA
Print_ISBN
0-7695-1224-0
Type
conf
DOI
10.1109/WECWIS.2001.933908
Filename
933908