Title :
Minimal dataset for Network Intrusion Detection Systems via MID-PCA: A hybrid approach
Author :
Nziga, Jean-Pierre ; Cannady, James
Author_Institution :
Grad. Sch. of Comput. & Inf. Sci., Nova Southeastern Univ., Fort Lauderdale, FL, USA
Abstract :
Network Intrusion Detection Systems (NIDS) monitor internet traffic to detect malicious activities. Unfortunately, the amount of data that NIDS must analyze is too large to process efficiently. Several feature selection and feature extraction techniques have been proposed to reduce the size of the data, but few focus on determining exactly how much the dataset should be reduced. The purpose of this paper is to help determine the minimal amount of data required for successful intrusion detection. A new hybrid algorithm, MID-PCA, combining PCA (Principal Component Analysis) and mRMR (minimum Redundancy Maximum Relevance, with the MID evaluation criterion) is proposed. PCA is first applied to the original dataset. Then, mRMR-MID is applied to the intermediate output to further reduce redundancy and maximize relevance. An exhaustive evaluation of the MID-PCA algorithm is conducted with KDD Cup '99, a dataset widely used in the network security community. MID-PCA performance is compared to that of PCA and mRMR using two classifiers, namely J48 (C4.5) and BayesNet. Experimental results indicate the effectiveness of the newly proposed MID-PCA algorithm for NIDS feature extraction compared with PCA and Mutual Information. MID-PCA shows better performance and classification accuracy with reduced datasets of only 4 dimensions for BayesNet (99.77%) and 6 dimensions for J48 (99.94%). This is an improvement over PCA, which achieves similar classification accuracy with 12 principal components (twelve dimensions). An extension of this paper will conduct broader experiments using other datasets, then compare the results to those of several well-known feature reduction algorithms to confirm the superiority of MID-PCA.
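The two-stage pipeline described in the abstract (PCA first, then mRMR with the MID criterion on the projected data) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state how mutual information is estimated on the continuous principal components, so the equal-width binning, the bin count, and every function name below (`pca_project`, `discretize`, `mutual_info`, `mrmr_mid`, `mid_pca`) are assumptions made for illustration. MID is taken here in its usual sense: each candidate is scored by its mutual information with the class label minus its mean mutual information with the features already selected.

```python
import numpy as np

def pca_project(X, n_components):
    """Project X onto its top principal components (SVD of the centered data)."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:n_components].T

def discretize(col, bins=10):
    """Equal-width binning (an assumption) so MI can be estimated from counts."""
    edges = np.linspace(col.min(), col.max(), bins + 1)[1:-1]
    return np.digitize(col, edges)

def mutual_info(a, b):
    """Mutual information, in nats, between two discrete 1-D arrays."""
    _, a_idx = np.unique(np.ravel(a), return_inverse=True)
    _, b_idx = np.unique(np.ravel(b), return_inverse=True)
    joint = np.zeros((a_idx.max() + 1, b_idx.max() + 1))
    np.add.at(joint, (a_idx, b_idx), 1)      # joint counts
    joint /= joint.sum()                     # joint probabilities
    pa = joint.sum(axis=1, keepdims=True)
    pb = joint.sum(axis=0, keepdims=True)
    nz = joint > 0                           # skip empty cells (0 * log 0 = 0)
    return float((joint[nz] * np.log(joint[nz] / (pa @ pb)[nz])).sum())

def mrmr_mid(F, y, k):
    """Greedy mRMR with the MID score: relevance minus mean redundancy."""
    D = np.column_stack([discretize(F[:, j]) for j in range(F.shape[1])])
    relevance = np.array([mutual_info(D[:, j], y) for j in range(D.shape[1])])
    selected = [int(np.argmax(relevance))]   # start with the most relevant feature
    while len(selected) < k:
        best, best_score = None, -np.inf
        for j in range(D.shape[1]):
            if j in selected:
                continue
            redundancy = np.mean([mutual_info(D[:, j], D[:, s]) for s in selected])
            score = relevance[j] - redundancy
            if score > best_score:
                best, best_score = j, score
        selected.append(best)
    return selected

def mid_pca(X, y, n_components, k):
    """PCA first, then mRMR-MID on the principal components."""
    Z = pca_project(X, n_components)
    keep = mrmr_mid(Z, y, k)
    return Z[:, keep], keep

# Usage on synthetic data (not KDD Cup '99); sizes are arbitrary.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 8))
y_demo = (X_demo[:, 0] + X_demo[:, 1] > 0).astype(int)
Z_demo, kept = mid_pca(X_demo, y_demo, n_components=5, k=2)
print(Z_demo.shape, kept)
```

The greedy loop re-estimates redundancy against every selected component at each step, which is acceptable for a few dozen components but would need caching for larger feature sets.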
Keywords :
Internet; computer network security; data analysis; feature extraction; pattern classification; principal component analysis; redundancy; telecommunication traffic; BayesNet classifier comparison; Internet traffic monitoring; J48 classifier comparison; KDD Cup '99; MID evaluation criteria; MID-PCA hybrid algorithm; NIDS; classification accuracy; data size reduction; feature extraction technique; feature selection technique; mRMR; malicious activity detection; minimum redundancy maximum relevance; mutual information; network intrusion detection system; principal component analysis; redundancy reduction; relevancy maximization; Accuracy; Algorithm design and analysis; Classification algorithms; Feature extraction; Intrusion detection; Mutual information; Principal component analysis; Dimensionality Reduction; Intrusion Detection; KDD; Mutual Information; Principal Component Analysis;
Conference_Title :
Intelligent Systems (IS), 2012 6th IEEE International Conference
Conference_Location :
Sofia
Print_ISBN :
978-1-4673-2276-8
DOI :
10.1109/IS.2012.6335176