DocumentCode :
775953
Title :
Efficient Mining of the Multidimensional Traffic Cluster Hierarchy for Digesting, Visualization, and Anomaly Identification
Author :
Wang, Jisheng ; Miller, David J. ; Kesidis, George
Author_Institution :
Pennsylvania State Univ., University Park, PA
Volume :
24
Issue :
10
fYear :
2006
Firstpage :
1929
Lastpage :
1941
Abstract :
Mining traffic to identify the dominant flows sent over a given link, over a specified time interval, is a valuable capability with applications to traffic auditing, simulation, and visualization, as well as anomaly detection. Recently, Estan et al. advanced a comprehensive data mining structure tailored for networking data: a parsimonious, multidimensional flow hierarchy, along with an algorithm for its construction. While they primarily targeted offline auditing, use in interactive traffic visualization and anomaly/attack detection will require real-time data mining. We suggest several improvements to Estan et al.'s algorithm that substantially reduce the computational complexity of multidimensional flow mining. We also propose computationally and memory-efficient approaches for unidimensional clustering of the IP address spaces. For baseline implementations, evaluated on the New Zealand (NZIX) trace data, our method reduced CPU execution times of the Estan et al. method by a factor of more than eight. We also develop a methodology for anomaly/attack detection based on flow mining, demonstrating the usefulness of this approach on traces from the Slammer and Code Red worms and the MIT Lincoln Laboratories DDoS data.
Keywords :
IP networks; computer viruses; data mining; data visualisation; multidimensional systems; telecommunication security; telecommunication traffic; DDoS data; IP address space; MIT Lincoln Laboratories; NZIX; New Zealand trace data; anomaly identification; code red worm; computational approach; data mining traffic; data visualization; distributed denial of service; memory-efficient approach; multidimensional flow hierarchy; slammer worm; unidimensional clustering; Clustering algorithms; Computational complexity; Context modeling; Data mining; Data visualization; Fluid flow measurement; Multidimensional systems; Telecommunication traffic; Traffic control; Uncertainty; Data digesting; data visualization; frequent item set mining; hierarchical clustering; network anomaly detection;
fLanguage :
English
Journal_Title :
Selected Areas in Communications, IEEE Journal on
Publisher :
IEEE
ISSN :
0733-8716
Type :
jour
DOI :
10.1109/JSAC.2006.877216
Filename :
1705623