Title :
Intrusion feature selection using Modified Heuristic Greedy Algorithm of Itemset
Author :
Onpans, Janya ; Rasmequan, Suwanna ; Jantarakongkul, Benchaporn ; Chinnasarn, Krisana ; Rodtook, Annupan
Author_Institution :
Fac. of Inf., Burapha Univ., Chonburi, Thailand
Abstract :
This paper proposes the Modified Heuristic Greedy Algorithm of Itemset (MHGIS) as a feature selection method for network intrusion data. The proposed method can be used as an alternative way to obtain suitable attributes for this data domain. MHGIS is modified from the original Heuristic Greedy Algorithm of Itemset (HGIS) to find suitable features more efficiently. In our work, we compare our results with a common feature selection method, Chi-Square (Chi2) feature selection. There are four main steps in our experiment. First, we pre-process the data to discard unnecessary attributes. Second, we apply MHGIS feature selection and Chi2 feature selection to the pre-processed data to reduce the number of attributes. Third, we measure recognition performance using four supervised learning algorithms: C4.5, BPNN, RBF and SVM. Last, we evaluate the results obtained from MHGIS and Chi2. From the KDDCup99 dataset, we randomly sampled 13,499 patterns with 34 data dimensions. Using the MHGIS and Chi2 algorithms, we obtain 14 and 26 features, respectively. The results show that C4.5 applied to the features selected by MHGIS yields better classification accuracy than the Chi2 and HGIS feature selections across all of the classification methods.
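As an illustration of the Chi2 baseline mentioned in the abstract, the following is a minimal sketch of chi-square feature ranking in plain Python. It is not the authors' implementation and it does not reproduce MHGIS, whose details are not given here. The function names `chi2_score` and `select_top_k`, the toy records, and the assumption that every feature is categorical (continuous KDDCup99 attributes would need to be discretized first) are all hypothetical choices made for this example.

```python
from collections import Counter


def chi2_score(feature_values, labels):
    """Pearson chi-square statistic between one categorical feature and the class label."""
    n = len(labels)
    feature_counts = Counter(feature_values)
    label_counts = Counter(labels)
    observed = Counter(zip(feature_values, labels))
    score = 0.0
    for value, value_count in feature_counts.items():
        for label, label_count in label_counts.items():
            # Expected cell count if the feature and the label were independent
            expected = value_count * label_count / n
            score += (observed[(value, label)] - expected) ** 2 / expected
    return score


def select_top_k(rows, labels, k):
    """Return the indices of the k highest-scoring features and all scores."""
    n_features = len(rows[0])
    scores = [chi2_score([row[j] for row in rows], labels) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k]), scores


# Hypothetical toy records (protocol, flag, noise); not taken from KDDCup99
rows = [
    ("tcp", "SF", "a"),
    ("tcp", "SF", "b"),
    ("udp", "S0", "a"),
    ("udp", "S0", "b"),
    ("tcp", "SF", "a"),
    ("udp", "S0", "b"),
]
labels = ["normal", "normal", "attack", "attack", "normal", "attack"]

selected, scores = select_top_k(rows, labels, k=2)
print(selected, scores)
```

In this toy data the first two columns separate the classes perfectly and therefore receive the highest scores, while the third column is nearly independent of the label and is dropped. The reduced feature set would then be passed to a classifier such as a C4.5-style decision tree for the accuracy comparison.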
Keywords :
greedy algorithms; learning (artificial intelligence); pattern classification; radial basis function networks; security of data; support vector machines; BPNN algorithm; C4.5 algorithm; Chi2 feature selection; KDDCup99 dataset; MHGIS feature selection; RBF algorithm; SVM algorithm; attributes reduction; backpropagation neural network; chi-square feature selection; classification accuracy; data pre-processing; intrusion feature selection; modified heuristic greedy algorithm of itemset; network intrusion data; radial basis function network; recognition performance; supervised learning algorithms; support vector machines; Accuracy; Feature extraction; Greedy algorithms; Intrusion detection; Itemsets; Principal component analysis; Support vector machines; Feature Selection; Heuristic Greedy; Network Intrusion Detection; Pattern Recognition;
Conference_Title :
Communications and Information Technologies (ISCIT), 2013 13th International Symposium on
Conference_Location :
Surat Thani
Print_ISBN :
978-1-4673-5578-0
DOI :
10.1109/ISCIT.2013.6645936