• DocumentCode
    3433334
  • Title
    Optimal supervised feature extraction in internet traffic classification
  • Author
    Aliakbarian, M. Sadegh; Fanian, Ali; Saleh, Fatemeh Sadat; Gulliver, T.A.
  • Author_Institution
    Dept. of Electr. & Comput. Eng., Isfahan Univ. of Technol., Isfahan, Iran
  • fYear
    2013
  • fDate
    27-29 Aug. 2013
  • Firstpage
    102
  • Lastpage
    107
  • Abstract
    Internet traffic classification is important in many aspects of network management such as data exploitation detection, malicious user identification, and restricting application traffic. Previously, features such as port and protocol numbers have been used to classify traffic, but these features can now be changed easily, making their use in traffic classification inadequate. Consequently, traffic classification based on machine learning (ML) is now employed. The number of features used in an ML algorithm has a significant impact on performance, in particular accuracy. In this paper, a minimum best feature set is chosen using a supervised method to obtain uncorrelated features. Outlier removal and data normalization are used to reduce the dimensionality. The data projected into the resulting space are then used to construct the classifier input. Finally, the decision tree, artificial neural network, and naïve Bayesian single-classifier algorithms, and the bagging and boosting ensemble algorithms, are used for traffic classification. Results are presented which show that the feature space dimension can be reduced to M-1, where M is the number of classes, with no loss in class separability.
  • Keywords
    Internet; computer network management; feature extraction; learning (artificial intelligence); pattern classification; telecommunication traffic; Internet traffic classification; ML algorithm; application traffic restriction; artificial neural network; bagging-boosting ensemble algorithms; class separability; data exploitation detection; data normalization; decision tree; dimensionality reduction; feature space dimension; machine learning; malicious user identification; minimum best feature set; naive Bayesian single-classifier algorithm; network management; optimal supervised feature extraction; outlier removal; port number; protocol number; Accuracy; Artificial neural networks; Classification algorithms; Eigenvalues and eigenfunctions; Internet; Machine learning algorithms; Protocols; Dimensionality reduction; Feature selection; Machine learning; Traffic classification;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Communications, Computers and Signal Processing (PACRIM), 2013 IEEE Pacific Rim Conference on
  • Conference_Location
    Victoria, BC
  • ISSN
    1555-5798
  • Type
    conf
  • DOI
    10.1109/PACRIM.2013.6625457
  • Filename
    6625457
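
Illustrative note on the abstract: the abstract does not name the supervised projection it uses. A reduction to M-1 dimensions for M classes is characteristic of Fisher discriminant analysis, so the sketch below assumes that technique purely for illustration. The function name `fisher_projection`, the `ridge` parameter, and the synthetic data are hypothetical and are not taken from the paper.

```python
import numpy as np


def fisher_projection(X, y, ridge=1e-6):
    """Project X onto the M-1 leading Fisher discriminant directions.

    Hypothetical helper for illustration; not taken from the paper.
    """
    classes = np.unique(y)
    d = X.shape[1]
    overall_mean = X.mean(axis=0)

    Sw = np.zeros((d, d))  # within-class scatter
    Sb = np.zeros((d, d))  # between-class scatter
    for c in classes:
        Xc = X[y == c]
        mean_c = Xc.mean(axis=0)
        centered = Xc - mean_c
        Sw += centered.T @ centered
        diff = (mean_c - overall_mean)[:, None]
        Sb += Xc.shape[0] * (diff @ diff.T)

    # Small ridge term (an assumption) keeps Sw positive definite.
    Sw += ridge * np.eye(d)

    # Solve Sb w = lambda Sw w as a symmetric problem using Sw = L L^T.
    L = np.linalg.cholesky(Sw)
    B = np.linalg.solve(L, np.linalg.solve(L, Sb).T)  # L^-1 Sb L^-T
    B = (B + B.T) / 2  # remove round-off asymmetry before eigh
    eigvals, V = np.linalg.eigh(B)
    order = np.argsort(eigvals)[::-1]  # largest eigenvalue first
    eigvals, V = eigvals[order], V[:, order]

    k = len(classes) - 1
    W = np.linalg.solve(L.T, V[:, :k])  # map back: w = L^-T v
    return X @ W, eigvals


# Synthetic demo data (not the paper's traffic traces): M = 3 classes, 10 features.
rng = np.random.default_rng(0)
M, d, n = 3, 10, 50
centers = rng.normal(scale=3.0, size=(M, d))
X = np.vstack([rng.normal(loc=c, size=(n, d)) for c in centers])
y = np.repeat(np.arange(M), n)

Z, eigvals = fisher_projection(X, y)
print(Z.shape)  # (150, 2)
```

In this sketch the between-class scatter matrix has rank at most M-1, so only the first M-1 eigenvalues are meaningfully nonzero. That is one way to read the abstract's claim that M-1 dimensions suffice without losing class separability.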