Title :
QRP05-4: Internet Traffic Identification using Machine Learning
Author :
Erman, Jeffrey ; Mahanti, Anirban ; Arlitt, Martin
Author_Institution :
Dept. of Comput. Sci., Univ. of Calgary, Calgary, AB
Date :
Nov. 27 2006-Dec. 1 2006
Abstract :
We apply an unsupervised machine learning approach to Internet traffic identification and compare the results with those of a previously applied supervised machine learning approach. Our unsupervised approach uses an expectation maximization (EM) based clustering algorithm, and the supervised approach uses the naive Bayes classifier. We find that the unsupervised clustering technique achieves an accuracy of up to 91% and outperforms the supervised technique by up to 9%. We also find that the unsupervised technique can be used to discover traffic from previously unknown applications and has the potential to become an excellent tool for exploring Internet traffic.
Keywords :
Bayes methods; Internet; expectation-maximisation algorithm; telecommunication computing; telecommunication traffic; unsupervised learning; Internet traffic identification; expectation maximization based clustering algorithm; naive Bayes classifier; supervised machine learning; unsupervised clustering; unsupervised machine learning; Clustering algorithms; Computer science; IP networks; Internet; Machine learning; Machine learning algorithms; Payloads; Peer to peer computing; Telecommunication traffic; Training data;
Conference_Title :
Global Telecommunications Conference, 2006. GLOBECOM '06. IEEE
Conference_Location :
San Francisco, CA
Print_ISBN :
1-4244-0356-1
Electronic_ISBN :
1930-529X
DOI :
10.1109/GLOCOM.2006.443