DocumentCode :
659567
Title :
GPU-accelerated incremental correlation clustering of large data with visual feedback
Author :
Papenhausen, Eric ; Bing Wang ; Sungsoo Ha ; Zelenyuk, Alla ; Imre, Dan ; Mueller, Klaus
Author_Institution :
Comput. Sci. Dept., Center for Visual Comput., Stony Brook Univ., Stony Brook, NY, USA
fYear :
2013
fDate :
6-9 Oct. 2013
Firstpage :
63
Lastpage :
70
Abstract :
Clustering is an important preparation step in big data processing. It may even be used to detect redundant data points as well as outliers. Eliminating redundant data and duplicates can serve as a viable means of data reduction, and it can also aid in sampling. Visual feedback is very valuable here because it gives users confidence in this process. Furthermore, big data preprocessing is seldom interactive, which conflicts with the needs of users who seek answers immediately. The best one can do is incremental preprocessing, in which partial and hopefully quite accurate results become available relatively quickly and are then refined over time. We propose a correlation clustering framework that uses MDS for layout and GPU acceleration to accomplish these goals. Our domain application is the correlation clustering of atmospheric mass spectrum data with 8 million data points of 450 dimensions each.
Keywords :
data visualisation; graphics processing units; learning (artificial intelligence); pattern clustering; GPU-accelerated incremental correlation clustering; MDS; atmospheric mass spectrum data; big data preprocessing; big data processing; data reduction; graphics processing unit; large data clustering; multidimensional scaling; outliers detection; redundant data points detection; visual feedback; Clustering algorithms; Correlation; Data handling; Data storage systems; Graphics processing units; Information management; Instruction sets; GPU; big data; clustering; correlation; visual analytics; visualization
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Big Data, 2013 IEEE International Conference on
Conference_Location :
Silicon Valley, CA
Type :
conf
DOI :
10.1109/BigData.2013.6691716
Filename :
6691716