DocumentCode :
2563450
Title :
Research on Algorithm of Web Classification Based on EP and FFSS
Author :
Wang, WeiPing ; Wang, Zufeng
fYear :
2007
fDate :
15-19 Dec. 2007
Firstpage :
162
Lastpage :
166
Abstract :
In this paper, we present a new web classification algorithm that combines extended pages (EP) and fair feature-subset selection (FFSS). Because hyperlinks are important, we extend web pages with anchor text. In extended pages, the proportion of useful features increases, so web classification can be improved. To exploit the structure of the web, we obtain extended pages by appending the sentence or paragraph that contains the anchor text to the original pages. Fair feature-subset selection not only treats each category fairly but can also identify useful features, both positive and negative, so it can address the high dimensionality of the vector space. Experiments show that the new algorithm improves on the precision and recall of the traditional method.
Keywords :
Classification algorithms; Computational intelligence; Feature extraction; Information management; Information security; Pattern recognition; Space technology; Support vector machine classification; Support vector machines; Web pages;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computational Intelligence and Security, 2007 International Conference on
Conference_Location :
Harbin
Print_ISBN :
0-7695-3072-9
Electronic_ISBN :
978-0-7695-3072-7
Type :
conf
DOI :
10.1109/CIS.2007.152
Filename :
4415323