DocumentCode
2529149
Title
Minimal dataset for Network Intrusion Detection Systems via dimensionality reduction
Author
Nziga, Jean-Pierre
Author_Institution
Grad. Sch. of Comput. & Inf. Sci., Nova Southeastern Univ., Fort Lauderdale, FL, USA
fYear
2011
fDate
26-28 Sept. 2011
Firstpage
168
Lastpage
173
Abstract
Network Intrusion Detection Systems (NIDS) monitor Internet traffic to detect malicious activities, including but not limited to denial-of-service attacks, network access by unauthorized users, attempts to gain additional privileges, and port scans. The amount of data that a NIDS must analyze is too large to process efficiently. Prior studies developed feature selection and feature extraction techniques to reduce the size of the data, but none has focused on determining exactly how much the dataset should be reduced. Dimensionality reduction is a field of machine learning that consists of mapping high-dimensional data into a lower-dimensional space while preserving the important features of the original dataset. Dimensionality reduction techniques have been used to reduce the amount of data in applications such as speech signals, digital photographs, fMRI scans, DNA microarrays, and hyperspectral data. The purpose of this paper is to find the minimal amount of data required for successful intrusion detection. This evaluation is necessary to improve the efficiency of NIDS in identifying existing attack patterns and recognizing new intrusions in real time. Two dimensionality reduction techniques are used: one linear technique (Principal Component Analysis) and one non-linear technique (Multidimensional Scaling). The reduced data is then submitted to two classification algorithms, J48 (C4.5) and Naïve Bayes. This study was conducted using the KDD Cup 99 data. Experimental results show optimal performance with reduced datasets of 4 dimensions for J48 and 12 dimensions for Naïve Bayes.
Keywords
Internet; computer network security; learning (artificial intelligence); principal component analysis; telecommunication traffic; C4.5; Internet traffic; J48; NIDS; classification algorithm; denial of service attack; dimensionality reduction; feature extraction; feature selection; machine learning; multidimensional scaling; naive Bayes method; network intrusion detection system; nonlinear technique; principal component analysis; Accuracy; Algorithm design and analysis; Classification algorithms; Covariance matrix; Feature extraction; Intrusion detection; Principal component analysis; Dimensionality Reduction; Intrusion Detection; KDD; Multidimensional Scaling; Principal Component Analysis;
fLanguage
English
Publisher
IEEE
Conference_Titel
Digital Information Management (ICDIM), 2011 Sixth International Conference on
Conference_Location
Melbourne, VIC
ISSN
Pending
Print_ISBN
978-1-4577-1538-9
Type
conf
DOI
10.1109/ICDIM.2011.6093368
Filename
6093368
Link To Document