DocumentCode
2551463
Title
The research of decision tree mining based on Hadoop
Author
Lu, Qiu ; Cheng, Xiao-hui
Author_Institution
Sch. of Inf. Sci. & Eng., Guilin Univ. of Technol., Guilin, China
fYear
2012
fDate
29-31 May 2012
Firstpage
798
Lastpage
801
Abstract
On a single node, decision-tree mining over massive data is computationally very expensive. To address this problem, this paper proposes HF_SPRINT, a parallel algorithm based on the Hadoop platform. The algorithm optimizes and improves the SPRINT algorithm and parallelizes it. Experimental results show that the algorithm achieves a high speedup ratio and good scalability.
Keywords
data mining; decision trees; parallel algorithms; public domain software; HF_SPRINT parallel algorithm; Hadoop platform; data mining categorization; data mining technologies; decision tree mining calculation; distributed software framework; Acceleration; Algorithm design and analysis; Classification algorithms; Data mining; Educational institutions; Indexes; Parallel processing; Hadoop; MapReduce; SPRINT;
fLanguage
English
Publisher
IEEE
Conference_Titel
Fuzzy Systems and Knowledge Discovery (FSKD), 2012 9th International Conference on
Conference_Location
Sichuan
Print_ISBN
978-1-4673-0025-4
Type
conf
DOI
10.1109/FSKD.2012.6234264
Filename
6234264