Title of article :
AN ENSEMBLE FEATURE SELECTION METHOD TO DETECT WEB SPAM
Author/Authors :
Danandeh Oskouei, Mahdieh Islamic Azad University, Shabestar Branch - Department of Computer, Iran , Razavi, Naser University of Tabriz - Department of Electrical and Computer Engineering, Iran
From page :
99
To page :
113
Abstract :
Feature selection is an important task in data mining, used to reduce the dimensionality of a feature set. Web spam detection is one of the research fields of data mining. As the amount of information available online grows and users increasingly depend on search, search engines and their ranking algorithms play an important role. Web spam is an illegitimate technique for inflating the rank of web pages by deceiving search engine algorithms, so an efficient detection method is essential. Many methods have been proposed to combat web spam. This paper proposes an ensemble feature selection method to detect web spam. The content features of the standard WEBSPAM-UK2007 dataset are used for evaluation. A Bayes network classifier is used with a 70-30% training-testing split of the dataset. The results show that the Area Under the ROC Curve (AUC) of this method is higher than that of the other methods reported in the paper. Moreover, the best values of the evaluation metrics obtained with the proposed method compare favorably with those of the other methods reported in the paper. In addition, the method improves classification metrics compared with the basic feature selection methods.
Keywords :
Ensemble feature selection , Web spam , Ranking , Machine learning
Journal title :
Asia-Pacific Journal of Information Technology and Multimedia
Record number :
2699080