DocumentCode
2867979
Title
A Web Page Classification Algorithm Based on Link Information
Author
Xu, Zhaohui ; Yan, Fuliang ; Qin, Jie ; Zhu, Haifeng
Author_Institution
Sch. of Comput., Wuhan Univ. of Technol., Wuhan, China
fYear
2011
fDate
14-17 Oct. 2011
Firstpage
82
Lastpage
86
Abstract
Effective classification of Web pages can improve the quality of information retrieval. Traditional classification algorithms rely mainly on analysis of Web page content, but that content is complicated and filled with a large amount of false and erroneous information, which seriously reduces the accuracy of network information classification. To solve this problem, this paper presents a Web page classification algorithm, Link Information Categorization (LIC). Based on the K nearest neighbor method, it combines website feature information to classify Web pages by their link information. Experiments show that the algorithm achieves higher efficiency and accuracy in Web page classification.
Keywords
Web sites; information retrieval; pattern classification; K nearest neighbor method; Web content analysis; Web page classification algorithm; Web page link; Web site features; information retrieval; link information categorization; network information classification; Accuracy; Algorithm design and analysis; Classification algorithms; Internet; Support vector machines; Text categorization; Web pages; Link Information; Link Information Categorization; Web Page Classification
fLanguage
English
Publisher
IEEE
Conference_Titel
Distributed Computing and Applications to Business, Engineering and Science (DCABES), 2011 Tenth International Symposium on
Conference_Location
Wuxi
Print_ISBN
978-1-4577-0327-0
Type
conf
DOI
10.1109/DCABES.2011.19
Filename
6119026