DocumentCode
3234060
Title
An efficient wrapper for Web data extraction and its application
Author
Zhang, Suzhi ; Shi, Peizhong
Author_Institution
Coll. of Comput. & Commun. Eng., Zhengzhou Univ. of Light Ind., Zhengzhou, China
fYear
2009
fDate
25-28 July 2009
Firstpage
1245
Lastpage
1250
Abstract
A Web wrapper extracts data from given Web sources according to their corresponding extraction rules. Its design is a key technology for Web information extraction and integration. This paper describes the design and implementation of a Web wrapper based on a pre-defined schema. It then validates data extraction on the new-book information Web pages of several publishing companies and analyses the extraction results obtained with this wrapper. We find that it can accurately extract the data from the Web source, and conclude that the proposed Web wrapper is feasible, efficient and maintainable. It will be applied to wrapper/mediator-based Web data integration, on which we rely to develop a Web application for a book information integration and query system.
Keywords
Internet; data handling; query processing; Web data extraction; Web data integration; Web information extraction; Web information integration; Web sources; Web wrapper; books information Web pages; query system; Application software; Books; Computer science; Computer science education; Data engineering; Data mining; Displays; Educational institutions; HTML; Web pages; Web data integration; Wrapper; extraction rule; information extraction; new book information;
fLanguage
English
Publisher
IEEE
Conference_Titel
Computer Science & Education, 2009. ICCSE '09. 4th International Conference on
Conference_Location
Nanning
Print_ISBN
978-1-4244-3520-3
Electronic_ISBN
978-1-4244-3521-0
Type
conf
DOI
10.1109/ICCSE.2009.5228403
Filename
5228403