DocumentCode
3633438
Title
Information Extraction from Web Pages
Author
Róbert Novotny;Peter Vojtas;Dušan Maruscak
Volume
3
fYear
2009
Firstpage
121
Lastpage
124
Abstract
We present a chain of techniques for extracting object attribute data from web pages that contain either data about multiple objects or detailed data about a single object. We discover data regions containing multiple data records, which are then extracted with the help of an extraction ontology. Furthermore, we present an additional algorithm for detail-page extraction based on the comparison of two HTML subtrees.
Keywords
"Data mining","Web pages","Ontologies","Intelligent agent","HTML","Software engineering","Target tracking","Conferences","Computer science","Collaboration"
Publisher
IEEE
Conference_Titel
Web Intelligence and Intelligent Agent Technologies, 2009. WI-IAT '09. IEEE/WIC/ACM International Joint Conferences on
Print_ISBN
978-0-7695-3801-3
Type
conf
DOI
10.1109/WI-IAT.2009.245
Filename
5284942