DocumentCode :
2113365
Title :
A Semantic DOM Approach for Webpage Information Extraction
Author :
Fei, Yulian ; Luo, Zongwei ; Xu, Yun ; Zhang, Winston
Author_Institution :
Comput. Sci. & Inf. Eng. Inst., Zhejiang Gongshang Univ., Hangzhou, China
fYear :
2009
fDate :
20-22 Sept. 2009
Firstpage :
1
Lastpage :
5
Abstract :
With the development of electronic technology and e-commerce, Web page technology has attracted considerable research effort and has recently become one of the most active research topics. This paper proposes a semantic DOM (SDOM) approach for extracting information from e-commerce Web pages. By combining content and structure information, the approach achieves good precision and recall in our experiments on listpage and tablepage data sets.
Keywords :
Web sites; electronic commerce; Web page information extraction; document object model; e-commerce; listpage dataset; semantic DOM approach; tablepage data set; Computer science; Data mining; Facebook; HTML; Information services; Internet; Machine learning; Tree data structures; Web pages
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Management and Service Science, 2009. MASS '09. International Conference on
Conference_Location :
Wuhan
Print_ISBN :
978-1-4244-4638-4
Electronic_ISBN :
978-1-4244-4639-1
Type :
conf
DOI :
10.1109/ICMSS.2009.5302541
Filename :
5302541