DocumentCode :
3436753
Title :
Streaming Random Forests
Author :
Abdulsalam, Hanady ; Skillicorn, David B. ; Martin, Patrick
Author_Institution :
Queen's Univ. Kingston, Kingston
fYear :
2007
fDate :
6-8 Sept. 2007
Firstpage :
225
Lastpage :
232
Abstract :
Many recent applications deal with data streams, conceptually endless sequences of data records, often arriving at high flow rates. Standard data-mining techniques typically assume that records can be accessed multiple times and so do not naturally extend to streaming data. Algorithms for mining streams must be able to extract all necessary information from records with only one, or perhaps a few, passes over the data. We present the streaming random forests algorithm, an online and incremental stream classification algorithm that extends Breiman's random forests algorithm. The streaming random forests algorithm grows multiple decision trees, and classifies unlabeled records based on the plurality of tree votes. We evaluate the classification accuracy of the streaming random forests algorithm on several datasets, and show that its accuracy is comparable to that of the standard random forests algorithm.
Keywords :
classification; data mining; decision trees; data streams; multiple decision trees; random forest streaming; stream classification; Approximation algorithms; Classification tree analysis; Clustering algorithms; Data flow computing; Loans and mortgages; Predictive models; Robustness; Testing; Data-stream classification; Random Forests
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Database Engineering and Applications Symposium, 2007. IDEAS 2007. 11th International
Conference_Location :
Banff, Alta.
ISSN :
1098-8068
Print_ISBN :
978-0-7695-2947-9
Type :
conf
DOI :
10.1109/IDEAS.2007.4318108
Filename :
4318108