Title :
Optimal supervised feature extraction in internet traffic classification
Author :
Aliakbarian, M. Sadegh ; Fanian, Ali ; Saleh, Fatemeh Sadat ; Gulliver, T.A.
Author_Institution :
Dept. of Electr. & Comput. Eng., Isfahan Univ. of Technol., Isfahan, Iran
Abstract :
Internet traffic classification is important in many aspects of network management such as data exploitation detection, malicious user identification, and restricting application traffic. Previously, features such as port and protocol numbers have been used to classify traffic, but these features can now be changed easily, making their use in traffic classification inadequate. Consequently, traffic classification based on machine learning (ML) is now employed. The number of features used in an ML algorithm has a significant impact on performance, in particular accuracy. In this paper, a minimum best feature set is chosen using a supervised method to obtain uncorrelated features. Outlier removal and data normalization are used to reduce the dimensionality. The data projected into the resulting space is then used to construct the classifier input. Finally, the decision tree, artificial neural network and naïve Bayesian single classifier algorithms, and the bagging and boosting ensemble algorithms, are used for traffic classification. Results are presented which show that the feature space dimension can be reduced to M-1, where M is the number of classes, with no loss in class separability.
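The abstract does not name the supervised extraction method. As an illustration only, the sketch below assumes Fisher's linear discriminant analysis, which is one well-known supervised projection whose output has at most M-1 dimensions for M classes. The function names, the synthetic "flow" data, and the z-score normalization step are all hypothetical stand-ins and are not taken from the paper.

```python
import numpy as np

def lda_projection(X, y):
    """Return a (d, M-1) projection matrix from Fisher's criterion.

    Assumption: this is textbook LDA, used only to illustrate the M-1 bound.
    """
    classes = np.unique(y)
    d = X.shape[1]
    mean_all = X.mean(axis=0)
    S_w = np.zeros((d, d))  # within-class scatter
    S_b = np.zeros((d, d))  # between-class scatter (rank <= M-1)
    for c in classes:
        Xc = X[y == c]
        mean_c = Xc.mean(axis=0)
        centered = Xc - mean_c
        S_w += centered.T @ centered
        diff = (mean_c - mean_all)[:, None]
        S_b += len(Xc) * (diff @ diff.T)
    # The pseudo-inverse guards against a singular within-class scatter matrix.
    eigvals, eigvecs = np.linalg.eig(np.linalg.pinv(S_w) @ S_b)
    # Keep the M-1 directions with the largest eigenvalues.
    order = np.argsort(eigvals.real)[::-1][: len(classes) - 1]
    return eigvecs[:, order].real

def synthetic_flows(seed=0, n_per_class=200, d=10, M=3):
    """Hypothetical stand-in for per-flow feature vectors with M classes."""
    rng = np.random.default_rng(seed)
    means = rng.normal(scale=5.0, size=(M, d))
    X = np.vstack([rng.normal(loc=m, size=(n_per_class, d)) for m in means])
    y = np.repeat(np.arange(M), n_per_class)
    return X, y

X, y = synthetic_flows()
X = (X - X.mean(axis=0)) / X.std(axis=0)  # z-score normalization (assumed)
W = lda_projection(X, y)
Z = X @ W  # projected data: 10 features reduced to M-1 = 2
print(Z.shape)
```

With three classes and ten input features, the projected matrix Z has two columns, and it could then be passed to any of the classifiers listed in the abstract.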
Keywords :
Internet; computer network management; feature extraction; learning (artificial intelligence); pattern classification; telecommunication traffic; Internet traffic classification; ML algorithm; application traffic restriction; artificial neural network; bagging-boosting ensemble algorithms; class separability; data exploitation detection; data normalization; decision tree; dimensionality reduction; feature space dimension; machine learning; malicious user identification; minimum best feature set; naive Bayesian single-classifier algorithm; network management; optimal supervised feature extraction; outlier removal; port number; protocol number; Accuracy; Artificial neural networks; Classification algorithms; Eigenvalues and eigenfunctions; Internet; Machine learning algorithms; Protocols; Dimensionality reduction; Feature selection; Machine learning; Traffic classification;
Conference_Title :
Communications, Computers and Signal Processing (PACRIM), 2013 IEEE Pacific Rim Conference on
Conference_Location :
Victoria, BC
DOI :
10.1109/PACRIM.2013.6625457