• DocumentCode
    654086
  • Title

    A parallel algorithm to induce decision trees for large datasets

  • Author

    Franco-Arcega, A. ; Suarez-Cansino, J. ; Flores-Flores, L.G.

  • Author_Institution
    Inf. & Syst. Technol. Res. Center, Autonomous Univ. of the State of Hidalgo, Hidalgo, Mexico
  • fYear
    2013
  • fDate
    Oct. 30 2013-Nov. 1 2013
  • Firstpage
    1
  • Lastpage
    6
  • Abstract
    This paper introduces a new parallel algorithm called ParDTLT and discusses some of its advantages over a set of well-known sequential and parallel algorithms. The parallel process occurs at every node of the decision tree, which is constructed during the supervised training phase. Parallel tasks are distributed according to the attributes of the training objects, and the growth of the tree is governed by two criteria: the maximum number of training objects that each node can hold and an entropic gain ratio criterion. Experiments compare the behavior of ParDTLT with that of the sequential algorithms C4.5, VFDT, YaDT and DTLT and with the parallel algorithm called Synchronous. The experimental results show that ParDTLT maintains classification quality while reducing execution time.
  • Keywords
    database management systems; decision trees; entropy; parallel algorithms; C4.5 algorithms; DTLT algorithms; ParDTLT; Synchronous algorithm; VFDT algorithms; YaDT algorithms; decision trees; entropic gain ratio criterion; execution time; large datasets; parallel algorithm; parallel process; parallel task distribution; sequential algorithms; supervised training phase; Algorithm design and analysis; Decision trees; Parallel algorithms; Program processors; Time complexity; Training;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Information, Communication and Automation Technologies (ICAT), 2013 XXIV International Symposium on
  • Conference_Location
    Sarajevo
  • Type

    conf

  • DOI
    10.1109/ICAT.2013.6684045
  • Filename
    6684045
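
The abstract describes distributing work over attributes and selecting splits with an entropic gain ratio. The sketch below illustrates that general idea only and is not the authors' ParDTLT implementation: the function names (`entropy`, `gain_ratio`, `best_attribute`), the dictionary-based rows, the `workers` parameter, and the use of a thread pool are all assumptions made for illustration. It scores each categorical attribute at a single node concurrently and returns the one with the highest gain ratio. The node-capacity criterion mentioned in the abstract is not modeled here.

```python
import math
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def gain_ratio(rows, labels, attr):
    """Gain ratio of splitting on one categorical attribute (hypothetical helper)."""
    # Group the class labels by the value each row takes for this attribute.
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr], []).append(label)

    total = len(labels)
    weights = [len(g) / total for g in groups.values()]
    # Weighted entropy remaining after the split.
    remainder = sum(w * entropy(g) for w, g in zip(weights, groups.values()))
    # Split information penalizes attributes with many distinct values.
    split_info = -sum(w * math.log2(w) for w in weights)
    if split_info == 0:
        return 0.0  # attribute takes a single value; no useful split
    return (entropy(labels) - remainder) / split_info


def best_attribute(rows, labels, attrs, workers=4):
    """Score every attribute concurrently, one task per attribute (assumed scheme)."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        scores = list(pool.map(lambda a: gain_ratio(rows, labels, a), attrs))
    best = max(range(len(attrs)), key=lambda i: scores[i])
    return attrs[best], scores[best]


if __name__ == "__main__":
    rows = [
        {"a": "x", "b": "p"},
        {"a": "x", "b": "q"},
        {"a": "y", "b": "p"},
        {"a": "y", "b": "q"},
    ]
    labels = [0, 0, 1, 1]
    print(best_attribute(rows, labels, ["a", "b"]))
```

A thread pool keeps the sketch short and self-contained. In CPython, threads do not speed up CPU-bound scoring, so a process pool or a distributed setting would be needed to observe the kind of execution-time reduction the abstract reports.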