Optimal supervised feature extraction in internet traffic classification

Author

Aliakbarian, M. Sadegh ; Fanian, Ali ; Saleh, Fatemeh Sadat ; Gulliver, T.A.

Author_Institution

Dept. of Electr. & Comput. Eng., Isfahan Univ. of Technol., Isfahan, Iran

fYear

2013

fDate

27-29 Aug. 2013

Firstpage

102

Lastpage

107

Abstract

Internet traffic classification is important in many aspects of network management such as data exploitation detection, malicious user identification, and restricting application traffic. Previously, features such as port and protocol numbers have been used to classify traffic, but these features can now be changed easily, making their use in traffic classification inadequate. Consequently, traffic classification based on machine learning (ML) is now employed. The number of features used in an ML algorithm has a significant impact on performance, in particular accuracy. In this paper, a minimum best feature set is chosen using a supervised method to obtain uncorrelated features. Outlier removal and data normalization are used to reduce the dimensionality. The data projected into the resulting space is then used to construct the classifier input. Finally, the decision tree, artificial neural network and naïve Bayesian single classifier algorithms, and the bagging and boosting ensemble algorithms, are used for traffic classification. Results are presented which show that the feature space dimension can be reduced to M-1, where M is the number of classes, with no loss in class separability.
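
The abstract does not name the supervised extraction method. As an illustration only, the sketch below assumes a Fisher linear discriminant analysis (LDA) projection, which is one standard supervised technique that yields at most M-1 discriminant directions for M classes. The function name lda_projection, the synthetic data, and all parameter values are hypothetical and are not taken from the paper.

```python
import numpy as np

def lda_projection(X, y, n_components=None):
    """Fisher LDA sketch: return a projection matrix and sorted eigenvalues.

    Assumption: this stands in for the paper's unnamed supervised method.
    """
    classes = np.unique(y)
    M = len(classes)
    d = X.shape[1]
    if n_components is None:
        n_components = M - 1  # between-class scatter has rank at most M-1

    mean_total = X.mean(axis=0)
    S_w = np.zeros((d, d))  # within-class scatter
    S_b = np.zeros((d, d))  # between-class scatter
    for c in classes:
        Xc = X[y == c]
        mean_c = Xc.mean(axis=0)
        centered = Xc - mean_c
        S_w += centered.T @ centered
        diff = (mean_c - mean_total).reshape(-1, 1)
        S_b += Xc.shape[0] * (diff @ diff.T)

    # Eigen-decomposition of pinv(S_w) @ S_b; pinv guards against a singular S_w.
    vals, vecs = np.linalg.eig(np.linalg.pinv(S_w) @ S_b)
    order = np.argsort(vals.real)[::-1]
    W = vecs[:, order[:n_components]].real
    return W, vals.real[order]

# Synthetic stand-in for flow features: 3 classes, 100 samples each, 10 features.
rng = np.random.default_rng(0)
n_classes, n_per_class, n_features = 3, 100, 10
class_means = rng.normal(scale=3.0, size=(n_classes, n_features))
X = np.vstack([class_means[c] + rng.normal(size=(n_per_class, n_features))
               for c in range(n_classes)])
y = np.repeat(np.arange(n_classes), n_per_class)

W, eigvals = lda_projection(X, y)
Z = X @ W  # projected data, usable as classifier input
print("projection matrix shape:", W.shape)
print("projected data shape:", Z.shape)
```

In this sketch only the leading M-1 eigenvalues are meaningfully nonzero, which is why the projected space has M-1 dimensions. The projected matrix Z would then be passed to whichever classifier is being evaluated.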

Keywords

Internet; computer network management; feature extraction; learning (artificial intelligence); pattern classification; telecommunication traffic; Internet traffic classification; ML algorithm; application traffic restriction; artificial neural network; bagging-boosting ensemble algorithms; class separability; data exploitation detection; data normalization; decision tree; dimensionality reduction; feature space dimension; machine learning; malicious user identification; minimum best feature set; naive Bayesian single-classifier algorithm; network management; optimal supervised feature extraction; outlier removal; port number; protocol number; Accuracy; Artificial neural networks; Classification algorithms; Eigenvalues and eigenfunctions; Internet; Machine learning algorithms; Protocols; Dimensionality reduction; Feature selection; Machine learning; Traffic classification;

fLanguage

English

Publisher

IEEE

Conference_Titel

Communications, Computers and Signal Processing (PACRIM), 2013 IEEE Pacific Rim Conference on

Conference_Location

Victoria, BC

ISSN

1555-5798

Type

conf

DOI

10.1109/PACRIM.2013.6625457

Filename

6625457