DocumentCode
1761915
Title
A New Method for Data Stream Mining Based on the Misclassification Error
Author
Rutkowski, Leszek ; Jaworski, Maciej ; Pietruczuk, Lena ; Duda, Piotr
Author_Institution
Inst. of Comput. Intell., Czestochowa Univ. of Technol., Czestochowa, Poland
Volume
26
Issue
5
fYear
2015
fDate
2015-05-01
Firstpage
1048
Lastpage
1059
Abstract
In this paper, a new method for constructing decision trees for stream data is proposed. First, a new splitting criterion based on the misclassification error is derived. A theorem is proven showing that the best attribute computed in the considered node according to the available data sample is the same, with high probability, as the attribute derived from the whole infinite data stream. Next, this result is combined with the splitting criterion based on the Gini index. It is shown that such a combination provides the highest accuracy among all studied algorithms.
Keywords
data mining; decision trees; pattern classification; Gini index; data stream mining; decision tree; misclassification error; splitting criterion; Accuracy; Gaussian distribution; Impurities; Indexes; Silicon; Classification; data stream; impurity measure
fLanguage
English
Journal_Title
Neural Networks and Learning Systems, IEEE Transactions on
Publisher
IEEE
ISSN
2162-237X
Type
jour
DOI
10.1109/TNNLS.2014.2333557
Filename
6857351