DocumentCode :
2709936
Title :
Learning on Class Imbalanced Data to Classify Peer-to-Peer Applications in IP Traffic using Resampling Techniques
Author :
Zhong, Weicai ; Raahemi, Bijan ; Liu, Jing
Author_Institution :
Telfer Sch. of Manage., Univ. of Ottawa, Ottawa, ON, Canada
fYear :
2009
fDate :
14-19 June 2009
Firstpage :
3548
Lastpage :
3554
Abstract :
In many applications, one class of data is presented by a large number of examples while the other only by a few. For instance, in our previous works on identification of peer-to-peer (P2P) Internet traffics, we observed that only about 30% of examples can be labeled as "P2P" using a port-based heuristic rule, and even fewer examples can be labeled in the future as more and more P2P applications use dynamic ports. In this paper, the effect of three resampling techniques on balancing the class distribution in training C4.5 and neural networks for identifying P2P traffic is studied. The experimental data were captured at our campus gateway. Nine datasets with different percentages of "P2P" examples and six datasets of different sizes with an actual percentage of about 30% of "P2P" examples are used in the experiments. The results show that resampling techniques are effective and stable, and random over-sampling is a quite good choice for P2P traffic identification considering a combination of the classification performance and time complexity.
Keywords :
IP networks; Internet; neural nets; peer-to-peer computing; telecommunication traffic; IP traffic; Internet; class imbalanced data learning; peer-to-peer application; port-based heuristic rule; resampling technique; Bandwidth; Communication system traffic control; Data mining; Internet; Labeling; Neural networks; Peer to peer computing; Predictive models; Telecommunication computing; Telecommunication traffic;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Neural Networks, 2009. IJCNN 2009. International Joint Conference on
Conference_Location :
Atlanta, GA
ISSN :
1098-7576
Print_ISBN :
978-1-4244-3548-7
Electronic_ISBN :
1098-7576
Type :
conf
DOI :
10.1109/IJCNN.2009.5178804
Filename :
5178804