DocumentCode :
2404375
Title :
SG-WRAP: a schema-guided wrapper generator
Author :
Meng, Xiaofeng ; Lu, Hongjun ; Wang, Haiyan ; Gu, Mingzhe
Author_Institution :
Inf. Sch., Renmin Univ. of China, Beijing, China
fYear :
2002
fDate :
2002
Firstpage :
331
Lastpage :
332
Abstract :
Although wrapper generation work has been reported in the literature, there seems to be no standard way to evaluate the performance of such systems. We conducted a series of experiments to evaluate the usability, correctness and efficiency of SG-WRAP. In the usability tests, a number of users were selected to use the system. The results indicated that, with a minimal introduction to the system, DTD definitions and the structure of HTML pages, even naive users could quickly generate wrappers without much difficulty. For correctness, we adapted the precision and recall metrics from information retrieval to data extraction. The results show that, with the refining process, the system can generate wrappers with very high accuracy. Finally, the efficiency tests indicated that the wrapper generation process is fast enough even for large Web pages.
Keywords :
Internet; hypermedia markup languages; user interfaces; HTML document; HTML pages; Rule Refiner; SG-WRAP; Web wrapper technology; World Wide Web; data type descriptors; rule generator; schema acquirer; schema-guided wrapper generator; user interaction; Councils; Data mining; Data preprocessing; HTML; Spatial databases; Uniform resource locators; Web pages; XML
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Data Engineering, 2002. Proceedings. 18th International Conference on
Conference_Location :
San Jose, CA
ISSN :
1063-6382
Print_ISBN :
0-7695-1531-2
Type :
conf
DOI :
10.1109/ICDE.2002.994743
Filename :
994743