DocumentCode :
3583161
Title :
Link spam detection based on genetic programming
Author :
Niu, Xiaofei ; Li, Shengen ; Niu, Xuedong ; Yuan, Ning ; Zhu, Cuiling
Author_Institution :
Sch. of Comput. Sci. & Technol., Shandong Jianzhu Univ., Jinan, China
Volume :
7
fYear :
2010
Firstpage :
3359
Lastpage :
3363
Abstract :
Link spam is the practice of unfairly boosting a web page's search engine ranking by manipulating the link graph to mislead hyperlink structure analysis algorithms. It seriously degrades the quality of search engine query results, and detecting it has become a major challenge for web search. This paper proposes using genetic programming to learn a discriminant function that detects link spam. The representation of individuals, the genetic operators, and the fitness function are studied. Experiments on WEBSPAM-UK2006 are carried out to find suitable parameters and to evaluate the validity of genetic programming. The experimental results show that, compared with SVM, this method improves spam classification recall by 27.5%, F-measure by 12.1%, and accuracy by 4.6%.
Keywords :
genetic algorithms; query processing; search engines; unsolicited e-mail; WEBSPAM-UK2006; Web page; Web search; discriminant function; fitness function; genetic programming; hyper-link structure analysis algorithms; link graph manipulation; link spam detection; search engine query; spam classification; Accuracy; Binary trees; Feature extraction; Genetic programming; Search engines; Unsolicited electronic mail; Web pages; Genetic Programming; Information Retrieval; Link Spam;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Natural Computation (ICNC), 2010 Sixth International Conference on
Print_ISBN :
978-1-4244-5958-2
Type :
conf
DOI :
10.1109/ICNC.2010.5583657
Filename :
5583657