DocumentCode
519765
Title
An ontology-based Web information extraction approach
Author
Wei-Guo, Yi ; Ling-Wei, Yan ; Ya-Qing, Liu ; Zhi, Liu
Author_Institution
Software Inst., Dalian Jiaotong Univ., Dalian, China
Volume
1
fYear
2010
fDate
21-24 May 2010
Abstract
An ontology-supervised approach to Web information extraction is proposed after analyzing two types of existing methods: those based on wrappers and those based on concept models. Using the concepts, and the taxonomic relations between concepts, provided by the ontology, the method quickly locates the wanted information blocks in a Web page by judging whether adjacent subtrees of the HTML tree are isomorphic. Furthermore, by incorporating the data modes of the text, the method filters out content that is irrelevant to the wanted information and achieves higher extraction accuracy.
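The subtree comparison mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes "isomorphic" means that two sibling subtrees have exactly the same ordered tag structure, it ignores text and attributes, and it leaves out the ontology matching and data-mode filtering entirely. The names `Node`, `TreeBuilder`, `signature`, `repeated_blocks`, and `find_blocks` are hypothetical helpers introduced only for this illustration.

```python
from html.parser import HTMLParser

# Elements that never have children or a closing tag (partial list, assumption).
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class Node:
    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.children = []


class TreeBuilder(HTMLParser):
    """Builds a bare tag tree; text and attributes are ignored."""

    def __init__(self):
        super().__init__()
        self.root = Node("#root")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self.current)
        self.current.children.append(node)
        if tag not in VOID_TAGS:
            self.current = node  # descend into the new element

    def handle_endtag(self, tag):
        # Close the nearest open element with this tag, if there is one.
        node = self.current
        while node is not self.root and node.tag != tag:
            node = node.parent
        if node is not self.root:
            self.current = node.parent


def signature(node):
    """Ordered structural signature: the tag plus the child signatures."""
    return (node.tag, tuple(signature(c) for c in node.children))


def repeated_blocks(node, min_run=2):
    """Collect (parent_tag, child_tag, run_length) for each run of adjacent
    sibling subtrees that share the same signature."""
    found = []
    sigs = [signature(c) for c in node.children]
    i = 0
    while i < len(sigs):
        j = i
        while j + 1 < len(sigs) and sigs[j + 1] == sigs[i]:
            j += 1
        # Skip runs of leaf elements; they carry no record-like structure.
        if j - i + 1 >= min_run and node.children[i].children:
            found.append((node.tag, node.children[i].tag, j - i + 1))
        i = j + 1
    for child in node.children:
        found.extend(repeated_blocks(child, min_run))
    return found


def find_blocks(html):
    builder = TreeBuilder()
    builder.feed(html)
    return repeated_blocks(builder.root)


if __name__ == "__main__":
    page = (
        "<html><body><h1>Books</h1><ul>"
        "<li><b>Title A</b><span>10</span></li>"
        "<li><b>Title B</b><span>12</span></li>"
        "<li><b>Title C</b><span>9</span></li>"
        "</ul><p>footer</p></body></html>"
    )
    print(find_blocks(page))
```

In this sketch, each reported run marks a candidate information block, such as the rows of a product list. A practical system would likely need a looser similarity test than exact equality, since real records often differ slightly in structure.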
Keywords
Internet; hypermedia markup languages; information filters; information retrieval; ontologies (artificial intelligence); trees (mathematics); HTML Tree; Web information extraction approach; Web page; concept model; information filter; ontology; subtrees; wrapper model; Data mining; HTML; Information analysis; Information science; Mathematical model; Mathematics; Ontologies; Physics; Taxonomy; Web pages; Information Extraction; Ontology; Wrapper;
fLanguage
English
Publisher
IEEE
Conference_Titel
Future Computer and Communication (ICFCC), 2010 2nd International Conference on
Conference_Location
Wuhan
Print_ISBN
978-1-4244-5821-9
Type
conf
DOI
10.1109/ICFCC.2010.5497820
Filename
5497820