Title :
Parallel Data Mining on Multicore Clusters
Author :
Qiu, Xiaohong ; Fox, Geoffrey ; Yuan, Huapeng ; Bae, Seung-Hee ; Chrysanthakopoulos, George ; Nielsen, Henrik
Author_Institution :
UITS, Indiana Univ., Bloomington, IN
Abstract :
The ever-increasing number of cores per chip will be accompanied by a pervasive data deluge whose size will probably increase even faster than CPU core count over the next few years. This suggests the importance of parallel data analysis and data mining applications with good multicore, cluster and grid performance. This paper considers data clustering, mixture models and dimensional reduction, presenting a unified framework applicable to bioinformatics, cheminformatics and demographics. Deterministic annealing is used to lessen the effect of local minima. We present performance results on clusters of 2-8 core systems, identifying effects from cache, runtime fluctuations, synchronization and memory bandwidth. We discuss the needed programming model and compare it with MPI and other approaches.
Keywords :
data analysis; data mining; parallel processing; pattern clustering; data clustering; deterministic annealing; dimensional reduction; mixture model; multicore clusters; parallel data analysis; parallel data mining; Bandwidth; Concurrent computing; Data analysis; Data mining; Grid computing; Multicore processing; Parallel processing; Parallel programming; Pervasive computing; Runtime; CCR; MPI; applications; cache; clusters; data mining; multicore; parallel; performance;
Conference_Title :
Grid and Cooperative Computing, 2008. GCC '08. Seventh International Conference on
Conference_Location :
Shenzhen
Print_ISBN :
978-0-7695-3449-7
DOI :
10.1109/GCC.2008.100