DocumentCode :
3579887
Title :
Content Information Extraction of Theme Web Pages Based on Tag Information
Author :
Jie Wang ; Jian Wu ; Yafeng Zhang ; Guowan He
Author_Institution :
Sch. of Manage., Capital Normal Univ., Beijing, China
Volume :
1
fYear :
2014
Firstpage :
501
Lastpage :
504
Abstract :
To extract the content information of theme Web pages more accurately, this paper proposes a self-learning method based on tag information that calculates the information quantity of various tag indicators. The method predefines several tag information indexes and coefficient indexes, calculates the information quantity of each tag in a Web page in turn, and takes the tag with the greatest information quantity as the location of the candidate content. To improve the versatility of the method, adaptive and adjustable coefficient weights are added to the formulas for tag information quantity. As more data is processed, the tag collections, index values, and information quantity results are added to the learning database to adjust the weights of the coefficient factors. Experimental results show that the extraction method with adaptive and adjustable coefficient weights can reach a recall rate of more than 99 percent. The method also does not depend on the specific structure or style of a Web page and has good versatility.
Keywords :
Internet; information retrieval; learning (artificial intelligence); coefficients index; content information extraction; information indexes; information quantity; self-learning method; tag collections; tag indicators; tag information; theme Web pages; Accuracy; Data mining; Feature extraction; Indexes; Web pages; Content Information Extraction; DOM Tree; Tag information quantity; Theme Web pages;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computational Intelligence and Design (ISCID), 2014 Seventh International Symposium on
Print_ISBN :
978-1-4799-7004-9
Type :
conf
DOI :
10.1109/ISCID.2014.257
Filename :
7064243