DocumentCode :
3277871
Title :
A scalable random forest algorithm based on MapReduce
Author :
Jiawei Han ; Yanheng Liu ; Xin Sun
Author_Institution :
Coll. of Comput. Sci. & Technol., Jilin Univ., Changchun, China
fYear :
2013
fDate :
23-25 May 2013
Firstpage :
849
Lastpage :
852
Abstract :
Random Forest is a popular classification algorithm in machine learning. This paper proposes SMRF, an improved scalable Random Forest algorithm based on the MapReduce model. The new algorithm performs data classification on massive datasets in a computer cluster or cloud computing environment. SMRF distributes subsets of the data across multiple participating computing nodes, where they are processed and optimized. Experimental results show that SMRF achieves accuracy comparable to the traditional Random Forest algorithm while delivering higher performance. SMRF is therefore better suited than the traditional Random Forest algorithm to classifying massive datasets in a distributed computing environment.
Keywords :
cloud computing; learning (artificial intelligence); pattern classification; MapReduce; SMRF algorithm; cloud computing environment; computer cluster; computing nodes distribution; data classification algorithm; data subsets; machine learning; massive datasets; scalable random forest algorithm; Classification algorithms; Computational modeling; DNA; Estimation; Programming; Radio frequency; Synthetic aperture sonar; Map-Reduce; Random Forest algorithm; data classification;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Software Engineering and Service Science (ICSESS), 2013 4th IEEE International Conference on
Conference_Location :
Beijing
ISSN :
2327-0586
Print_ISBN :
978-1-4673-4997-0
Type :
conf
DOI :
10.1109/ICSESS.2013.6615438
Filename :
6615438