Title : 
Training on multiple sub-flows to optimise the use of Machine Learning classifiers in real-world IP networks
         
        
            Author : 
Nguyen, Thuy T T ; Armitage, Grenville
         
        
            Author_Institution : 
Centre for Advanced Internet Architectures, Swinburne University of Technology, Melbourne, Vic.
         
        
        
        
        
            Abstract : 
Literature on the use of machine learning (ML) algorithms for classifying IP traffic has relied on full flows or the first few packets of flows. In contrast, many real-world scenarios require a classification decision well before a flow has finished, even if the flow's beginning is lost. This implies classification must be achieved using statistics derived from the most recent N packets taken at any arbitrary point in a flow's lifetime. We propose training the classifier on a combination of short sub-flows (extracted from full-flow examples of the target application's traffic). We demonstrate this optimisation using the naive Bayes ML algorithm, and show that our approach results in excellent performance even when classification is initiated mid-way through a flow with windows as small as 25 packets long. We suggest future use of unsupervised ML algorithms to identify optimal sub-flows for training.
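
The sub-flow idea described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the authors' implementation: the packet representation as (timestamp, length) pairs, the four window statistics, the training offsets (0, 100, 500), the synthetic "game" and "bulk" flows, and the small hand-written Gaussian naive Bayes class. Only the window size of 25 packets is taken from the abstract.

```python
import math
from statistics import mean, pstdev

N = 25  # window size in packets, as mentioned in the abstract


def sub_flows(packets, n, starts):
    """Return windows of n consecutive packets beginning at each offset."""
    return [packets[s:s + n] for s in starts if s + n <= len(packets)]


def features(window):
    """Summarise a window of (timestamp, length) packets (assumed features)."""
    lengths = [length for _, length in window]
    gaps = [b[0] - a[0] for a, b in zip(window, window[1:])]
    return [mean(lengths), pstdev(lengths), mean(gaps), pstdev(gaps)]


class GaussianNB:
    """Tiny Gaussian naive Bayes, written out so the sketch is self-contained."""

    def fit(self, X, y):
        self.stats = {}
        for label in set(y):
            rows = [x for x, lab in zip(X, y) if lab == label]
            cols = list(zip(*rows))
            self.stats[label] = (
                math.log(len(rows) / len(X)),            # log prior
                [mean(c) for c in cols],                 # per-feature mean
                [pstdev(c) ** 2 + 1e-6 for c in cols],   # smoothed variance
            )
        return self

    def predict(self, x):
        def log_posterior(label):
            prior, mus, variances = self.stats[label]
            return prior + sum(
                -0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
                for xi, m, v in zip(x, mus, variances)
            )
        return max(self.stats, key=log_posterior)


def synthetic_flow(base_len, base_gap, count):
    """Hypothetical deterministic flow: (timestamp in s, length in bytes)."""
    t, packets = 0.0, []
    for i in range(count):
        packets.append((t, base_len + (i % 5) * 4))
        t += base_gap * (1 + (i % 3) * 0.1)
    return packets


# Full-flow examples of two hypothetical applications.
flows = {
    "game": synthetic_flow(80, 0.020, 1000),
    "bulk": synthetic_flow(1400, 0.001, 1000),
}

# Train on several short sub-flows per application, not on the full flow.
X, y = [], []
for label, packets in flows.items():
    for window in sub_flows(packets, N, starts=[0, 100, 500]):
        X.append(features(window))
        y.append(label)
clf = GaussianNB().fit(X, y)

# Classify a window captured mid-flow, with the flow's beginning unseen.
mid_window = flows["game"][333:333 + N]
prediction = clf.predict(features(mid_window))
```

Training on windows drawn from several points in the flow is what lets the classifier recognise a window that starts at an arbitrary offset; a model trained only on the window at offset 0 would see only start-of-flow statistics.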
         
        
            Keywords : 
Bayes methods; IP networks; learning (artificial intelligence); telecommunication traffic; IP traffic; machine learning classifier; naive Bayes algorithm; Government; Inspection; Intrusion detection; Machine learning; Machine learning algorithms; Payloads; Protocols; TCPIP
         
        
        
        
            Conference_Title : 
Local Computer Networks, Proceedings 2006 31st IEEE Conference on
         
        
            Conference_Location : 
Tampa, FL
         
        
        
            Print_ISBN : 
1-4244-0418-5
         
        
            Electronic_ISBN : 
0742-1303
         
        
        
            DOI : 
10.1109/LCN.2006.322122