• DocumentCode
    2102346
  • Title
    Decision Tree Network Traffic Classifier Via Adaptive Hierarchical Clustering for Imperfect Training Dataset
  • Author
    Lin, Ping ; Lei, Zhenming ; Chen, Luying ; Yang, Jie ; Liu, Fang
  • Author_Institution
    Key Lab. of Inf. Process. & Intell. Technol., Beijing Univ. of Posts & Telecommun., Beijing, China
  • fYear
    2009
  • fDate
    24-26 Sept. 2009
  • Firstpage
    1
  • Lastpage
    6
  • Abstract
    Existing network traffic classifiers often assume the availability of an ideal training dataset. In practice, however, the training dataset may contain a substantial number of flows labeled as 'unknown', including both flows from classes that are not modeled by the classifier and unrecognized flows from modeled classes. Such a training dataset seriously degrades the recall rate and generalization capability of existing classifiers, which treat unknowns as just another normal class. In this paper, we propose a semi-supervised multivariate decision tree classification algorithm based on adaptive hierarchical clustering. Rather than using the Gini index or information gain, which rely on a perfect training dataset, we use adaptive hierarchical clustering to construct the decision tree. The clustering process can identify unknown flows belonging to modeled classes, avoiding the pitfall of existing algorithms, which treat them the same as real unknowns. After mapping each leaf cluster to a class based on its majority members and assigning decision rules based on cluster centers, we obtain a multivariate decision tree. The experimental results show that our algorithm can significantly improve the recall rate of flows belonging to modeled classes compared to a decision tree classifier, with only a small impact on precision.
  • Keywords
    decision trees; pattern clustering; telecommunication network management; telecommunication traffic; adaptive hierarchical clustering; decision rules; decision tree network traffic classifier; semisupervised multivariate decision tree classification algorithm; Classification tree analysis; Clustering algorithms; Communication system traffic control; Cryptography; Decision trees; Internet; Payloads; Quality of service; Telecommunication traffic; Traffic control;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Wireless Communications, Networking and Mobile Computing, 2009. WiCom '09. 5th International Conference on
  • Conference_Location
    Beijing
  • Print_ISBN
    978-1-4244-3692-7
  • Electronic_ISBN
    978-1-4244-3693-4
  • Type
    conf
  • DOI
    10.1109/WICOM.2009.5302133
  • Filename
    5302133
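
The abstract describes building a decision tree by hierarchical clustering, mapping each leaf cluster to a class by its majority members, and routing flows with rules based on cluster centers. The following is a minimal illustrative sketch of that general idea, not the authors' algorithm. Everything specific in it is an assumption made for illustration: the deterministic 2-means split in `bisect`, the stopping rule (`max_radius2`, `depth`), the choice to take the majority among labeled members only, the `"unknown"` marker, and the toy two-feature flows.

```python
from collections import Counter

UNKNOWN = "unknown"  # assumed marker for unlabeled flows


def dist2(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(points):
    """Component-wise mean of a non-empty list of feature vectors."""
    return tuple(sum(col) / len(points) for col in zip(*points))


def bisect(points, iters=10):
    """Deterministic 2-means split (an assumed stand-in for the paper's
    adaptive clustering); returns two centers and two index lists."""
    c0 = points[0]
    c1 = max(points, key=lambda p: dist2(p, c0))  # farthest point from c0
    left, right = [], []
    for _ in range(iters):
        left = [i for i, p in enumerate(points) if dist2(p, c0) <= dist2(p, c1)]
        right = [i for i, p in enumerate(points) if dist2(p, c0) > dist2(p, c1)]
        if not left or not right:
            break
        c0 = centroid([points[i] for i in left])
        c1 = centroid([points[i] for i in right])
    return c0, c1, left, right


def build(points, labels, max_radius2=4.0, depth=8):
    """Split clusters that are both label-mixed and spread out; otherwise
    emit a leaf labeled by the majority of its labeled members."""
    center = centroid(points)
    mixed = len(set(labels)) > 1
    spread = max(dist2(p, center) for p in points)
    if mixed and spread > max_radius2 and depth > 0:
        c0, c1, left, right = bisect(points)
        if left and right:
            kids = [build([points[i] for i in idx], [labels[i] for i in idx],
                          max_radius2, depth - 1) for idx in (left, right)]
            return ("split", c0, c1, kids[0], kids[1])
    known = Counter(l for l in labels if l != UNKNOWN)
    return ("leaf", known.most_common(1)[0][0] if known else UNKNOWN)


def classify(tree, x):
    """At each split, follow the nearer child center until reaching a leaf."""
    while tree[0] == "split":
        _, c0, c1, left, right = tree
        tree = left if dist2(x, c0) <= dist2(x, c1) else right
    return tree[1]


# Toy flows with two made-up features; one unlabeled flow sits inside each
# modeled class, and three unlabeled flows form a separate group.
flows = [(0, 0), (0, 1), (1, 0), (1, 1),
         (10, 10), (10, 11), (11, 10), (11, 11),
         (30, 0), (31, 0), (30, 1)]
labels = ["web", "web", "web", UNKNOWN,
          "p2p", "p2p", "p2p", UNKNOWN,
          UNKNOWN, UNKNOWN, UNKNOWN]
tree = build(flows, labels)
```

In this toy setup, the unlabeled flows that fall inside a labeled cluster inherit that cluster's class, while the separate unlabeled group stays unknown. Each split compares distances to two centers, so the rule at each node depends on all features at once, which is what makes the resulting tree multivariate.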