DocumentCode
2261131
Title
Dynamically Constructing a Global Schema for Web Entities
Author
Xu, Xiuxing ; Li, Qingzhong ; Dong, Yongquan ; Ding, Yanhui
Author_Institution
Sch. of Comput. Sci. & Technol., Shandong Univ., Jinan, China
fYear
2010
fDate
20-22 Aug. 2010
Firstpage
127
Lastpage
131
Abstract
With the rapid development of the Internet, popular entities have more and more instances on the Web. On one hand, for the same Web entity, different instances often contain different attributes, and for the same attribute, different instances often use different labels; on the other hand, new Web entity instances containing new attributes and labels keep appearing on the Web. It is therefore difficult to dynamically construct a global schema for the Web entities of a given entity type, although such a schema is highly desirable for Web entity instance detection, extraction, and integration. In this paper, we propose a novel approach to dynamically construct a global schema for the Web entities of a given entity type. First, an SVM (support vector machine) classification model is built from the Web entity instances that have been extracted from related Web pages. Then, based on this model, a global schema discovery approach is provided to dynamically construct the global schema for the target entity type. Experimental results on Chinese Web sites show that the approach is general and effective.
Keywords
Internet; Web sites; data mining; support vector machines; Information extraction; Information integration; SVM; Web entity; Web pages; global schema; Classification algorithms; Construction industry; Training; Web Entities; Web Information Integration
fLanguage
English
Publisher
IEEE
Conference_Titel
Web Information Systems and Applications Conference (WISA), 2010 7th
Conference_Location
Hohhot
Print_ISBN
978-1-4244-8440-9
Type
conf
DOI
10.1109/WISA.2010.32
Filename
5581387