• DocumentCode
    2560754
  • Title

    A statistical-feature-based approach to internet traffic classification using Machine Learning

  • Author

    Huang, Shijun ; Chen, Kai ; Liu, Chao ; Liang, Alei ; Guan, Haibing

  • Author_Institution
    Sch. of Inf. Security Eng., Shanghai Jiao Tong Univ., Shanghai, China
  • fYear
    2009
  • fDate
    12-14 Oct. 2009
  • Firstpage
    1
  • Lastpage
    6
  • Abstract
    Internet traffic classification using Machine Learning has been an emerging research field since the 1990s and is now widely used in numerous network activities. The technique models the attributes and features of data flows in order to identify the applications that generate them. In this paper we design and implement a classification model based on header-derived statistical flow features. Unlike traditional methods, the model is completely insensitive to port numbers and to application-level payload contents, and so avoids the operational difficulties caused by unreliable port numbers and the complexity of payload interpretation. Rather than relatively complex ML algorithms, or combinations of them, a supervised k-Nearest Neighbor estimator is adopted for the sake of computational efficiency, together with effective, easy-to-calculate statistical features selected according to the operational background. Our results indicate that about 90% accuracy on per-flow classification can be achieved, a vast improvement over traditional techniques, which achieve 50-70%.
  • Keywords
    Internet; computational complexity; learning (artificial intelligence); pattern classification; telecommunication computing; telecommunication traffic; Internet traffic classification; computational efficiency; header-derived flow statistical feature; machine learning; statistical feature based approach; supervised k-nearest neighbor estimator; Algorithm design and analysis; Classification algorithms; Computational efficiency; Cryptography; IP networks; Machine learning; Payloads; Telecommunication traffic; Testing; Web and internet services; Machine Learning; flow features; k-Nearest Neighbor; traffic classification;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Ultra Modern Telecommunications & Workshops, 2009. ICUMT '09. International Conference on
  • Conference_Location
    St. Petersburg
  • Print_ISBN
    978-1-4244-3942-3
  • Electronic_ISBN
    978-1-4244-3941-6
  • Type

    conf

  • DOI
    10.1109/ICUMT.2009.5345539
  • Filename
    5345539
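
  • Illustrative sketch (not part of the original record)
    The abstract describes a supervised k-Nearest Neighbor estimator applied to header-derived statistical flow features. The record does not list the paper's actual features, distance metric, or value of k, so everything below is an assumption made for illustration: the four features (mean and standard deviation of packet size, mean inter-arrival time, packet count), min-max scaling, Euclidean distance, k=3, and the hypothetical labels "bulk" and "interactive". It is a minimal sketch of the general idea, not the authors' implementation.

```python
import math
from collections import Counter


def flow_features(packets):
    """Summarize one flow given as a list of (timestamp_s, length_bytes).

    Only header-level information is used: no ports, no payload.
    The feature set here is an assumption, not the paper's.
    """
    sizes = [size for _, size in packets]
    times = [t for t, _ in packets]
    # Inter-arrival gaps; a single-packet flow gets one zero gap.
    gaps = [b - a for a, b in zip(times, times[1:])] or [0.0]
    mean_size = sum(sizes) / len(sizes)
    var_size = sum((s - mean_size) ** 2 for s in sizes) / len(sizes)
    return (mean_size, math.sqrt(var_size),
            sum(gaps) / len(gaps), float(len(sizes)))


def knn_predict(train, query, k=3):
    """Majority vote among the k training flows nearest to `query`.

    train: list of (feature_tuple, label); query: feature_tuple.
    Features are min-max scaled over the training set so that no
    single feature dominates the Euclidean distance.
    """
    dims = range(len(query))
    lo = [min(f[d] for f, _ in train) for d in dims]
    hi = [max(f[d] for f, _ in train) for d in dims]

    def scale(v):
        return [(v[d] - lo[d]) / (hi[d] - lo[d]) if hi[d] > lo[d] else 0.0
                for d in dims]

    q = scale(query)
    nearest = sorted(train, key=lambda item: math.dist(scale(item[0]), q))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical labeled flows: large, tightly spaced packets versus
# small, widely spaced packets.
raw_train = [
    ([(0.00, 1500), (0.01, 1500), (0.02, 1460), (0.03, 1500)], "bulk"),
    ([(0.00, 1400), (0.02, 1500), (0.04, 1500), (0.05, 1480)], "bulk"),
    ([(0.00, 1500), (0.01, 1450), (0.03, 1500)], "bulk"),
    ([(0.0, 80), (0.6, 64), (1.3, 90)], "interactive"),
    ([(0.0, 70), (0.4, 100), (1.1, 60), (1.9, 85)], "interactive"),
    ([(0.0, 64), (0.8, 72)], "interactive"),
]
train = [(flow_features(pkts), label) for pkts, label in raw_train]

bulk_query = flow_features([(0.0, 1480), (0.01, 1500), (0.02, 1500)])
chat_query = flow_features([(0.0, 75), (0.7, 66), (1.5, 88)])
bulk_label = knn_predict(train, bulk_query)
chat_label = knn_predict(train, chat_query)
```

    Because the features come only from packet sizes and timestamps, the sketch stays independent of port numbers and payload contents, which is the property the abstract emphasizes. A real deployment would need many more labeled flows and a validated feature set.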