• DocumentCode
    2112649
  • Title
    An Improved Algorithm of Decision Trees for Streaming Data Based on VFDT
  • Author
    Li, Feixiong ; Liu, Quan
  • Author_Institution
    Provincial Key Lab. for Comput. Inf. Process. Technol., Soochow Univ., Suzhou
  • Volume
    1
  • fYear
    2008
  • fDate
    20-22 Dec. 2008
  • Firstpage
    597
  • Lastpage
    600
  • Abstract
    Decision trees are an effective model for classification. Recently, there has been much interest in mining streaming data. Because streaming data is large and unbounded, passing over the entire data set more than once is impractical, so a one-pass online algorithm is necessary. One of the most successful algorithms for mining data streams is VFDT (Very Fast Decision Tree). We extend the VFDT system to EVFDT (Efficient-VFDT) in two directions: (1) We present the Uneven Interval Numerical Pruning (UINP) approach for efficiently processing numerical attributes. (2) We use naive Bayes classifiers associated with the tree nodes to process the samples, detecting outlying samples and reducing the size of the trees. In the experimental comparison, the two techniques significantly improve the efficiency and accuracy of decision tree construction on streaming data.
  • Keywords
    Bayes methods; data mining; decision trees; pattern classification; Naive Bayes classifier; UINP approach; VFDT system; data stream mining; one pass online algorithm; uneven interval numerical pruning; very fast decision tree; Decision Trees; Naive Bayes Classifiers; Streaming Data Mining; Unequal Interval Numerical Pruning (UINP)
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Information Science and Engineering, 2008. ISISE '08. International Symposium on
  • Conference_Location
    Shanghai
  • Print_ISBN
    978-1-4244-2727-4
  • Type
    conf
  • DOI
    10.1109/ISISE.2008.256
  • Filename
    4732288