Title :
A scalable and dynamic self-organizing map for clustering large volumes of text data
Author :
Matharage, Sumith ; Ganegedara, Hiran ; Alahakoon, D.
Abstract :
The Self-Organizing Map (SOM) and the Growing Self-Organizing Map (GSOM) are widely used techniques for text mining. Mining large text data sets is significantly processor intensive [1]. Recently, the Fast Growing Self-Organizing Map (FastGSOM) was proposed as an improvement to the GSOM for clustering text data more efficiently [2]. For text corpuses with thousands of documents, however, the time requirement can still be a bottleneck, resulting in high turnaround times for the analysis process. We propose a new scalable parallel algorithm for text analysis using FastGSOM, which can harness the power of parallel and distributed computing for efficient analysis of large-scale text datasets. We demonstrate that the proposed algorithm has similar or better accuracy compared to GSOM and is several orders of magnitude more efficient when operating in parallel.
Keywords :
data mining; parallel algorithms; pattern clustering; self-organising feature maps; text analysis; FastGSOM; analysis process; distributed computing; growing self organizing map; parallel algorithm; parallel computing; processor intensive; self-organizing map; text corpuses; text data clustering; text datasets; text mining; Clustering algorithms; Indexes; Neurons; Partitioning algorithms; Topology; Training; Vectors
Conference_Title :
Neural Networks (IJCNN), The 2013 International Joint Conference on
Conference_Location :
Dallas, TX
Print_ISBN :
978-1-4673-6128-6
DOI :
10.1109/IJCNN.2013.6706733