Title :
Data transformation for warehousing Web data
Author :
Zhu, Yan ; Bornhövd, Christof ; Buchmann, Alejandro P.
Author_Institution :
Dept. of Comput. Sci., Darmstadt Univ. of Technol., Germany
Abstract :
In order to analyze market trends and make reasonable business plans, a company's local data is not sufficient. Decision-making must also be based on information from suppliers, partners and competitors. This external data can be obtained from the World Wide Web in many cases, but must be integrated with the company's own data, e.g. in a data warehouse. To this end, Web data has to be mapped to the star schema of the warehouse. In this paper, we propose a semi-automatic approach to support this transformation process. Our approach is based on the use of a rooted labeled tree representation of Web data and the existing warehouse schema. Based on this common view, we can compare the source and target schemata to identify correspondences. We show how the correspondences guide the transformation to be accomplished automatically. We also explain the meaning of recursion and restructuring in mapping rules, which are the core of the transformation algorithm.
Keywords :
data handling; data warehouses; information resources; marketing data processing; World Wide Web data warehousing; business plans; company data; competitor information; data integration; data mapping rules; data warehouse schema; decision-making; external data; market trends; partner information; recursion; restructuring; rooted labeled tree representation; schema correspondences; semi-automatic data transformation; star schema; supplier information; Algorithm design and analysis; Companies; Computer science; Data warehouses; Information analysis; Motion analysis; Performance analysis; Performance gain; Tree graphs; Warehousing;
Conference_Title :
Advanced Issues of E-Commerce and Web-Based Information Systems, WECWIS 2001, Third International Workshop on.
Conference_Location :
San Jose, CA
Print_ISBN :
0-7695-1224-0
DOI :
10.1109/WECWIS.2001.933908