• DocumentCode
    2849835
  • Title
    Communication efficient construction of decision trees over heterogeneously distributed data
  • Author
    Giannella, Chris ; Liu, Kun ; Olsen, Todd ; Kargupta, Hillol
  • Author_Institution
    Dept. of Comput. Sci. & Electr. Eng., Maryland Univ., Baltimore, MD, USA
  • fYear
    2004
  • fDate
    1-4 Nov. 2004
  • Firstpage
    67
  • Lastpage
    74
  • Abstract
    We present an algorithm designed to efficiently construct a decision tree over heterogeneously distributed data without centralizing the data. We compare our algorithm against a standard centralized decision tree implementation in terms of both accuracy and communication complexity. Our experimental results show that, using only 20% of the communication cost necessary to centralize the data, we can achieve trees whose accuracy is at least 80% of that of the trees produced by the centralized version.
  • Keywords
    communication complexity; data mining; decision trees; distributed processing; communication efficient construction; distributed data mining; heterogeneously distributed data; random projection; Algorithm design and analysis; Communication channels; Communication standards; Complexity theory; Computer science; Costs; Distributed decision making; Message passing
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining, 2004. ICDM '04. Fourth IEEE International Conference on
  • Print_ISBN
    0-7695-2142-8
  • Type
    conf
  • DOI
    10.1109/ICDM.2004.10114
  • Filename
    1410268