• DocumentCode
    2551463
  • Title
    The research of decision tree mining based on Hadoop
  • Author
    Lu, Qiu ; Cheng, Xiao-hui
  • Author_Institution
    Sch. of Inf. Sci. & Eng., Guilin Univ. of Technol., Guilin, China
  • fYear
    2012
  • fDate
    29-31 May 2012
  • Firstpage
    798
  • Lastpage
    801
  • Abstract
    When massive data sets are processed on a single node, the computational cost of decision-tree mining is very large. To address this problem, this paper proposes HF_SPRINT, a parallel algorithm based on the Hadoop platform. The algorithm optimizes and improves the SPRINT algorithm and parallelizes it. The experimental results show that the algorithm achieves a high speedup ratio and good scalability.
  • Keywords
    data mining; decision trees; parallel algorithms; public domain software; HF_SPRINT parallel algorithm; Hadoop platform; data mining categorization; data mining technologies; decision tree mining calculation; distributed software framework; Acceleration; Algorithm design and analysis; Classification algorithms; Data mining; Educational institutions; Indexes; Parallel processing; Hadoop; MapReduce; SPRINT;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Fuzzy Systems and Knowledge Discovery (FSKD), 2012 9th International Conference on
  • Conference_Location
    Sichuan
  • Print_ISBN
    978-1-4673-0025-4
  • Type
    conf
  • DOI
    10.1109/FSKD.2012.6234264
  • Filename
    6234264