DocumentCode
1843030
Title
Active learning with neural networks for intrusion detection
Author
Seliya, Naeem ; Khoshgoftaar, Taghi M.
Author_Institution
Comput. & Inf. Sci., Univ. of Michigan-Dearborn, Dearborn, MI, USA
fYear
2010
fDate
4-6 Aug. 2010
Firstpage
49
Lastpage
54
Abstract
This paper presents a neural-network-based active learning procedure for computer network intrusion detection. Applying data mining and machine learning techniques to network intrusion detection often runs into the problem of very large training datasets. For example, the training dataset commonly used for the DARPA KDD-1999 offline intrusion detection project contained approximately five hundred thousand observations (a 10% sample of the original five million), which were used to build intrusion detection classification models. The practical problems associated with such a large dataset include very long model training times, redundant information, and increased complexity in understanding the domain-specific data. We demonstrate that a simple active learning procedure can dramatically reduce the size of the training data without significantly sacrificing the classification accuracy of the intrusion detection model. Our work uses a case study of the DARPA KDD-1999 intrusion detection project. Each network traffic instance is classified into one of two categories: normal or attack. A comparison of the actively trained neural network model with a C4.5 decision tree indicated that the actively learned model had better generalization accuracy. In addition, the training data classification performance of the actively learned model was comparable to that of the C4.5 decision tree.
Keywords
data mining; decision trees; learning (artificial intelligence); neural nets; security of data; C4.5 decision tree; DARPA KDD-1999; active learning; computer network intrusion detection; data mining; machine learning techniques; network traffic instances; neural networks; very large training dataset size; Artificial neural networks; Biological system modeling; Data models; Intrusion detection; Machine learning; Training; Training data
fLanguage
English
Publisher
IEEE
Conference_Titel
Information Reuse and Integration (IRI), 2010 IEEE International Conference on
Conference_Location
Las Vegas, NV
Print_ISBN
978-1-4244-8097-5
Type
conf
DOI
10.1109/IRI.2010.5558967
Filename
5558967