DocumentCode :
1921793
Title :
A Novel Self-Organizing Map for Text Document Organization
Author :
Yang, Hsin-Chang ; Lee, Chung-Hong
Author_Institution :
Dept. Inf. Manage., Nat. Univ. of Kaohsiung, Kaohsiung, Taiwan
fYear :
2012
fDate :
26-28 Sept. 2012
Firstpage :
39
Lastpage :
44
Abstract :
The self-organizing map (SOM) is a well-known neural network model with a wide range of applications. Its two main characteristics are dimension reduction and topology preservation: a SOM maps a high-dimensional data space to a low-dimensional space while preserving the topological relations among the data. Because of these characteristics, the SOM is commonly applied to data clustering and visualization tasks. One major shortcoming of the classical SOM learning algorithm is that the map topology must be defined in advance. Furthermore, hierarchical relationships among the data are difficult to reveal. In this work, we propose a novel SOM learning algorithm that incorporates several text mining techniques to expand the map both laterally and hierarchically, so that relationships among documents can be discovered from both perspectives. The proposed algorithm first clusters a set of training documents using the classical SOM algorithm. We then identify the topics of each cluster and use them to evaluate the criteria for expanding the map. We applied the algorithm to medium-size datasets and obtained promising results.
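As an illustration of the first step the abstract names (clustering documents with the classical SOM), the following is a minimal sketch only. It is not the authors' algorithm and does not cover their lateral or hierarchical map expansion. The function names `train_som` and `best_matching_unit`, the linear decay schedules, the default parameters, and the toy document vectors are all assumptions made for this example. Note that the grid size (`rows`, `cols`) must be fixed in advance, which is the limitation the abstract describes.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid position whose weight vector is closest to x (squared Euclidean)."""
    return min(
        weights,
        key=lambda u: sum((wi - xi) ** 2 for wi, xi in zip(weights[u], x)),
    )


def train_som(data, rows, cols, epochs=50, lr0=0.5, sigma0=None, seed=0):
    """Train a classical SOM on a predefined rows x cols grid.

    Hypothetical helper: returns a dict mapping (row, col) -> weight vector.
    """
    rng = random.Random(seed)
    dim = len(data[0])
    sigma0 = sigma0 or max(rows, cols) / 2.0
    # One weight vector per map unit, initialised uniformly in [0, 1).
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        # Present the inputs in a fresh random order each epoch.
        for x in rng.sample(data, len(data)):
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                  # assumed linear decay
            sigma = max(sigma0 * (1.0 - frac), 0.5)  # assumed floor on radius
            bmu = best_matching_unit(weights, x)
            for unit, w in weights.items():
                # Gaussian neighbourhood measured on the grid, not in data space.
                d2 = (unit[0] - bmu[0]) ** 2 + (unit[1] - bmu[1]) ** 2
                h = math.exp(-d2 / (2.0 * sigma * sigma))
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
            step += 1
    return weights


if __name__ == "__main__":
    # Toy term-weight vectors standing in for documents (made up for this sketch).
    docs = [
        [1.0, 0.9, 0.0, 0.0],
        [0.9, 1.0, 0.1, 0.0],
        [0.0, 0.1, 1.0, 0.9],
        [0.0, 0.0, 0.9, 1.0],
    ]
    som = train_som(docs, rows=2, cols=2, epochs=20)
    for doc in docs:
        print(doc, "->", best_matching_unit(som, doc))
```

Each document is assigned to the map unit returned by `best_matching_unit`, and documents sharing a unit form one cluster. Topic identification and the expansion criteria described in the abstract would operate on those clusters and are outside the scope of this sketch.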
Keywords :
data mining; data visualisation; learning (artificial intelligence); pattern clustering; self-organising feature maps; text analysis; topology; SOM model; classical SOM learning algorithm; data clustering; data visualization; dimension reduction; high-dimensional data space; medium-size datasets; novel SOM learning algorithm; novel self-organizing map; text document organization; text mining techniques; topological relations; topology preservation; training documents; well-known neural network model; Clustering algorithms; Neural networks; Neurons; Text categorization; Topology; Training; Vectors; Hierarchy Generation; Self-organizing Map; Text Mining; Topic Identification;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Innovations in Bio-Inspired Computing and Applications (IBICA), 2012 Third International Conference on
Conference_Location :
Kaohsiung
Print_ISBN :
978-1-4673-2838-8
Type :
conf
DOI :
10.1109/IBICA.2012.53
Filename :
6337634