DocumentCode
618135
Title
Analyzing string format-based classifiers for botnet detection: GP and SVM
Author
Haddadi, Fariba ; Zincir-Heywood, A. Nur
Author_Institution
Comput. Sci., Dalhousie Univ., Halifax, NS, Canada
fYear
2013
fDate
20-23 June 2013
Firstpage
2626
Lastpage
2633
Abstract
The domain name system (DNS) is an essential component of the Internet. Because all legitimate users and applications are expected to use it, it is generally subject to fewer inspections, restrictions and filters. Botnets rely on this open component to carry out their malicious operations. To avoid a single point of failure and to evade static blacklists and firewalls, they employ DNS-based methods that frequently generate new domain names automatically. Stateful-SBB, a form of genetic programming (GP), was previously designed and developed by the authors to detect these automatically generated domain names using minimal a priori information, and was shown to be efficient. In this paper, we compare Stateful-SBB against the String Subsequence Kernel (SSK) and SSK with Lambda Pruning (SSK-LP), which are based on support vector machines (SVM) and also take string-format inputs. Analyzing the domain names that each classifier chooses as part of its solution during classification, we observe that 50% to 63% of Stateful-SBB's frequently selected points on the Pareto front are also used by SSK and SSK-LP, respectively. By analyzing these common domain names, we identify some characteristics of botnet domain names. Moreover, we introduce a pruned version of Stateful-SBB that reduces solution complexity by 83% while maintaining the same high accuracy.
Keywords
Internet; data analysis; genetic algorithms; pattern classification; security of data; support vector machines; DNS-based method; GP; SSK with lambda pruning; SVM; Stateful-SBB; botnet detection; classification process; classifier analysis; domain name system; genetic programming; string format input; string format-based classifier; string subsequence kernel; Computers; Feature extraction; Kernel; Servers; Training; botnet domain name detection; evolutionary computation
fLanguage
English
Publisher
IEEE
Conference_Titel
Evolutionary Computation (CEC), 2013 IEEE Congress on
Conference_Location
Cancun
Print_ISBN
978-1-4799-0453-2
Electronic_ISBN
978-1-4799-0452-5
Type
conf
DOI
10.1109/CEC.2013.6557886
Filename
6557886