Title :
An integrated machine learning and control theoretic model for mining concept-drifting data streams
Author :
Shetty, Sachin ; Mukkavilli, Sai Kiran ; Keel, L.H.
Author_Institution :
Dept. of Electr. & Comput. Eng., Tennessee State Univ., Nashville, TN, USA
Abstract :
Anomaly-based network Intrusion Detection Systems (IDSs) model patterns of normal activity and detect novel network attacks. However, these systems depend on the availability of a profile of the system's normal traffic pattern, and the statistical fingerprint of that pattern can change and shift over time due to changes in operational or user activity at the networked site, or even system updates. Such changes in normal traffic patterns over time lead to concept drift. Some changes are temporal or cyclical; they can be short-lived or last for longer periods. Depending on a number of factors, the speed at which traffic patterns change also varies, from near instantaneous to a span of many months. These changes are a concern for IDSs because they can significantly increase false positive rates, thereby reducing overall system performance. To improve the reliability of the IDS, an automated mechanism is needed to detect valid traffic changes and avoid inappropriate ad hoc responses. ROC curves have historically been used to evaluate the accuracy of IDSs. ROC curves generated using fixed, time-invariant classification thresholds do not characterize the best accuracy that an IDS can achieve in the presence of concept-drifting network traffic. In this paper, we present an integrated supervised machine learning and control theoretic model for detecting concept drift in network traffic patterns. The model comprises an online support vector machine based classifier (incremental anomaly-based detection), a Kullback-Leibler divergence based relative entropy measurement scheme (quantifying concept drift), and a feedback control engine (adapting ROC thresholding). In our proposed system, any intrusion activity will cause significant variations, thereby causing a large error, while a minor aberration in the variations (concept drift) will not be immediately reported as an alert.
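The relative entropy measurement named in the abstract can be illustrated with a minimal sketch. The abstract does not say which traffic features are compared, how they are binned, or how empty bins are handled, so the histograms, the function name kl_divergence, and the additive smoothing constant eps below are all assumptions made for illustration, not the authors' scheme.

```python
import math


def kl_divergence(p_counts, q_counts, eps=1e-9):
    """Relative entropy D(P || Q), in nats, between two histograms over the same bins."""
    if len(p_counts) != len(q_counts):
        raise ValueError("histograms must share the same bins")
    # Additive smoothing (an assumption) avoids log(0) and division by zero on empty bins.
    p = [c + eps for c in p_counts]
    q = [c + eps for c in q_counts]
    p_total, q_total = sum(p), sum(q)
    return sum(
        (pi / p_total) * math.log((pi / p_total) / (qi / q_total))
        for pi, qi in zip(p, q)
    )


# Hypothetical per-window counts of some traffic feature (for example, packet-size buckets).
baseline = [50, 30, 15, 5]      # reference profile of normal traffic
mild_shift = [45, 32, 17, 6]    # small deviation, the kind gradual drift might produce
sharp_shift = [5, 10, 25, 60]   # large deviation, the kind an intrusion might produce

print(kl_divergence(mild_shift, baseline))
print(kl_divergence(sharp_shift, baseline))
```

In this toy setup the sharply shifted window yields a much larger divergence from the baseline than the mildly shifted one, which matches the abstract's distinction between a large error caused by intrusion activity and a minor aberration caused by drift. How that value feeds the feedback control engine is not described in the abstract and is not modeled here.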
Keywords :
data mining; feedback; learning (artificial intelligence); pattern classification; security of data; support vector machines; Kullback-Leibler divergence; ROC curve; adapting ROC thresholding; anomaly-based network intrusion detection systems; concept drift detection; concept-drifting data stream mining; control theoretic model; entropy measurement scheme; feedback control engine; incremental anomaly based detection; network attack; normal traffic pattern profile; receiver operating characteristic curve; supervised machine learning; support vector machine based classifier; user activity; Accuracy; Adaptation models; Engines; Entropy; Feedback control; Support vector machines; Training; Anomaly Based Intrusion Detection Systems; Concept Drift; Support Vector Machine;
Conference_Title :
Technologies for Homeland Security (HST), 2011 IEEE International Conference on
Conference_Location :
Waltham, MA
Print_ISBN :
978-1-4577-1375-0
DOI :
10.1109/THS.2011.6107850