Title :
A hybrid heuristics-statistical peer-to-peer traffic classifier
Author :
Hassan, Mohammad ; Marsono, M.N.
Author_Institution :
Fac. of Electr. Eng., Univ. Teknol. Malaysia, Skudai, Malaysia
Abstract :
Peer-to-peer (P2P) traffic consumes a significant share of Internet bandwidth and therefore requires effective control. This work proposes a novel hybrid heuristics-statistical approach to classifying P2P traffic. The heuristics approach provides highly accurate P2P detection, but it involves measuring and analyzing many correlations between packets and flows over a certain duration of time, which makes it inapplicable to online P2P traffic classification. Statistical classification, on the other hand, can classify traffic online, although it needs periodic, often manual, retraining. The proposed hybrid solution merges the two approaches: offline generation of a learning corpus by heuristics, and online statistical classification. In the first part, heuristics are used to classify traffic flows into three classes, two of which are later used to train the online statistical classifier. This work enhances the existing heuristics-based P2P classification by adding a new class for unknown traffic. Analyses of offline traces using the improved heuristics show that adding the third class reduces class noise from 7% to 2%, thereby providing quality examples for retraining the online statistical classifier. In the second part, machine learning (ML) algorithms classify traffic on the fly based on flow and packet statistics. Using examples generated by the heuristics classifier, the overall statistical classification accuracy is 99%, based on analysis of downloaded and captured traces.
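The two-stage pipeline described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the flow fields (`uses_tcp_and_udp`, `distinct_peers`, `dst_port`, `stats`), the thresholds, and the port list are hypothetical and are not the heuristics from the paper, and a nearest-centroid rule stands in for the unspecified ML algorithms. The sketch only shows the structure: heuristics label flows offline into three classes, the unknown class is dropped, and the remaining two classes train a lightweight statistical classifier.

```python
from statistics import mean

# Hypothetical heuristics: field names, thresholds, and ports are
# illustrative assumptions, not the rules used in the paper.
def heuristic_label(flow):
    """Offline stage: label a flow as 'p2p', 'non_p2p', or 'unknown'."""
    if flow["uses_tcp_and_udp"] or flow["distinct_peers"] >= 10:
        return "p2p"
    if flow["dst_port"] in {25, 53, 80, 443} and flow["distinct_peers"] <= 2:
        return "non_p2p"
    return "unknown"  # third class: ambiguous flows stay out of training


def build_corpus(flows):
    """Keep only the two confidently labeled classes as training examples."""
    corpus = []
    for flow in flows:
        label = heuristic_label(flow)
        if label != "unknown":
            corpus.append((flow["stats"], label))
    return corpus


def train_centroids(corpus):
    """Stand-in statistical model: one mean feature vector per class."""
    rows_by_class = {}
    for stats, label in corpus:
        rows_by_class.setdefault(label, []).append(stats)
    return {
        label: [mean(column) for column in zip(*rows)]
        for label, rows in rows_by_class.items()
    }


def classify(centroids, stats):
    """Online stage: assign the class whose centroid is closest."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(stats, centroid))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))


# Made-up flows; 'stats' is (mean packet size in bytes, mean inter-arrival time in s).
flows = [
    {"uses_tcp_and_udp": True, "distinct_peers": 40, "dst_port": 51413, "stats": (900.0, 0.05)},
    {"uses_tcp_and_udp": False, "distinct_peers": 25, "dst_port": 6881, "stats": (1100.0, 0.08)},
    {"uses_tcp_and_udp": False, "distinct_peers": 1, "dst_port": 443, "stats": (300.0, 0.50)},
    {"uses_tcp_and_udp": False, "distinct_peers": 2, "dst_port": 80, "stats": (350.0, 0.40)},
    {"uses_tcp_and_udp": False, "distinct_peers": 4, "dst_port": 8080, "stats": (600.0, 0.20)},
]

corpus = build_corpus(flows)
centroids = train_centroids(corpus)
print(classify(centroids, (950.0, 0.06)))
```

In this toy setup the last flow matches neither rule, so it is labeled unknown and excluded from the corpus, which mirrors how the third class is meant to keep noisy examples out of the training set.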
Keywords :
Internet; learning (artificial intelligence); pattern classification; peer-to-peer computing; statistical analysis; telecommunication traffic; Internet bandwidth; P2P detection; flow correlation; hybrid heuristics-statistical peer-to-peer traffic classifier; machine learning algorithm; offline heuristics learning corpus generation; offline trace; online P2P traffic classification; online statistical classification; packet correlation; packet statistics; trace analysis; unknown traffic; Accuracy; Classification algorithms; Clustering algorithms; Payloads; Peer to peer computing; Ports (Computers); Training; Heuristics Classification; Hybrid Classifier; Machine Learning; Peer-to-peer;
Conference_Title :
Computer Systems and Industrial Informatics (ICCSII), 2012 International Conference on
Conference_Location :
Sharjah
Print_ISBN :
978-1-4673-5155-3
DOI :
10.1109/ICCSII.2012.6454475