Title :
A parallel decision tree builder for mining very large visualization datasets
Author :
Bowyer, K.W. ; Hall, L.O. ; Moore, T. ; Chawla, N. ; Kegelmeyer, W.P.
Author_Institution :
Univ. of South Florida, Tampa, FL, USA
Abstract :
Simulation problems in the DOE ASCI program generate visualization datasets more than a terabyte in size. The practical difficulties in visualizing such datasets motivate the desire for automatic recognition of salient events. We have developed a parallel decision tree classifier for use in this context. Comparisons to ScalParC, a previous attempt to build a fast parallelization of a decision tree classifier, are provided. Our parallel classifier executes on the "ASCI Red" supercomputer. Experiments demonstrate that datasets too large to be processed on a single processor can be efficiently handled in parallel, and suggest that there need not be any decrease in accuracy relative to a monolithic classifier constructed on a single processor.
Keywords :
data mining; data visualisation; decision trees; digital simulation; parallel processing; pattern classification; physics computing; ASCI Red supercomputer; DOE ASCI program; ScalParC; automatic event recognition; decision tree classifier; fast parallelization; parallel decision tree builder; simulation problems; very large visualization dataset mining; Acceleration; Classification tree analysis; Concurrent computing; Data visualization; Decision trees; Physics computing; Supercomputers; Testing; Training data; US Department of Energy
Conference_Title :
Systems, Man, and Cybernetics, 2000 IEEE International Conference on
Conference_Location :
Nashville, TN
Print_ISBN :
0-7803-6583-6
DOI :
10.1109/ICSMC.2000.886388