DocumentCode :
327289
Title :
Parallel formulations of decision-tree classification algorithms
Author :
Srivastava, Anurag ; Han, Eui-Hong Sam ; Singh, Vineet ; Kumar, Vipin
Author_Institution :
Lab. of Inf. Technol., Hitachi America Ltd., Brisbane, CA, USA
fYear :
1998
fDate :
10-14 Aug 1998
Firstpage :
237
Lastpage :
244
Abstract :
Classification decision tree algorithms are used extensively for data mining in many domains, such as retail target marketing and fraud detection. Highly parallel algorithms for constructing classification decision trees are desirable for dealing with large data sets in a reasonable amount of time. Algorithms for building classification decision trees have a natural concurrency, but are difficult to parallelize due to the inherent dynamic nature of the computation. We present parallel formulations of a classification decision tree learning algorithm based on induction. We describe two basic parallel formulations. One is based on the Synchronous Tree Construction Approach and the other is based on the Partitioned Tree Construction Approach. We discuss the advantages and disadvantages of using these methods and propose a hybrid method that employs the good features of these methods. Experimental results on an IBM SP-2 demonstrate excellent speedups and scalability.
Keywords :
decision theory; deductive databases; knowledge acquisition; learning by example; parallel algorithms; pattern classification; tree data structures; trees (mathematics); IBM SP-2; Partitioned Tree Construction Approach; Synchronous Tree Construction Approach; classification decision tree algorithms; classification decision tree learning algorithm; data mining; decision tree classification algorithms; fraud detection; highly parallel algorithms; hybrid method; induction; inherent dynamic nature; large data sets; natural concurrency; parallel formulations; retail target marketing; Classification algorithms; Classification tree analysis; Computer science; Concurrent computing; Contracts; Data mining; Decision trees; Identity-based encryption; Information technology; Training data;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Parallel Processing, 1998. Proceedings. 1998 International Conference on
Conference_Location :
Minneapolis, MN
ISSN :
0190-3918
Print_ISBN :
0-8186-8650-2
Type :
conf
DOI :
10.1109/ICPP.1998.708491
Filename :
708491