Title :
Generic feature selection measure for botnet malware detection
Author :
Berg, P.E. ; Franke, Katrin ; Hai Thanh Nguyen
Author_Institution :
Dept. of Comput. Sci. & Media Technol., Gjøvik Univ. Coll., Gjøvik, Norway
Abstract :
Feature selection is an important task in botnet malware detection. In this paper, we study the recently proposed Generic-Feature-Selection (GeFS) measure [18]. Since there is no benchmark dataset of botnet malware, we conduct experiments on a dataset generated with publicly available tools. We use static and dynamic approaches [24], [29], [12] to extract features from the generated dataset and produce two separate feature sets. We analyze the statistical properties of these feature sets to gain more insight into their nature and quality. Subsequently, we determine appropriate instances of the GeFS measure for feature selection. We compare the GeFS measure experimentally with two other feature selection methods for botnet malware detection: the genetic-algorithm-CFS and the best-first-CFS algorithms. We use five different classifiers to measure detection rates and false positive rates. The experiments show that we can remove 99.9% of the irrelevant and redundant features from the datasets while maintaining or even improving classification performance. Moreover, the GeFS measure outperforms the genetic-algorithm-CFS and best-first-CFS methods by removing many more redundant features.
Keywords :
feature extraction; invasive software; linear programming; statistical analysis; tree searching; GeFS measure; best-first-CFS algorithm; botnet malware detection; branch-and-bound; dynamic approaches; feature sets; generic feature selection measure; genetic-algorithm-CFS algorithm; mixed 0-1 linear programming; static approaches; statistical properties; Correlation; Data mining; Intrusion detection; Libraries; Malware; Mutual information; botnets; feature selection; machine learning; malware analysis
Conference_Title :
Intelligent Systems Design and Applications (ISDA), 2012 12th International Conference on
Conference_Location :
Kochi
Print_ISBN :
978-1-4673-5117-1
DOI :
10.1109/ISDA.2012.6416624