Title :
Exploring the Effects of Text Clustering on On-Line Military News Based on Quantitative Association Rule
Author :
Chen, Liang-Chu ; Yang, Chyi-Bao ; Chen, Jih-Hsin ; Lien, Yen-Hsuan
Author_Institution :
Dept. of Comput. Sci., Nat. Defense Univ., Taoyuan, Taiwan
Abstract :
Text clustering is an automatic technique that groups texts by using feature extraction and term connection to calculate the similarity among their subject contents. Because the properties of terms in Chinese text (e.g., segmentation and annotation) are less clear-cut than in other languages, extracting and distinguishing features from Chinese text is considerably more difficult, which strongly affects clustering quality. Focusing on military news, this paper applies a quantitative association rule together with a hierarchical agglomerative algorithm to cluster Chinese news published in Youth Daily News, and compares the results with those of the traditional vector space model approach and the general association rule approach. The F-measure is used as the evaluation metric in the experiments. Experimental results show that the quantitative association rule approach clusters texts more accurately than both the vector space model approach and the association rule approach.
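The abstract names the F-measure as the evaluation metric but does not say which variant was used. As a minimal sketch, assuming the class-weighted clustering F-measure that is common in document clustering work (for each reference class, take the best F-score over all clusters, then weight by class size), the computation could look like the following. The function name `clustering_f_measure` and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import Counter


def clustering_f_measure(true_labels, cluster_ids):
    """Class-weighted clustering F-measure (an assumed variant, not
    necessarily the one used in the paper).

    For each reference class i and cluster j:
        precision = n_ij / n_j,  recall = n_ij / n_i
        F(i, j)   = 2 * precision * recall / (precision + recall)
    Overall score = sum over classes of (n_i / n) * max_j F(i, j).
    """
    if len(true_labels) != len(cluster_ids):
        raise ValueError("true_labels and cluster_ids must have equal length")
    n = len(true_labels)
    if n == 0:
        return 0.0

    class_sizes = Counter(true_labels)                # n_i
    cluster_sizes = Counter(cluster_ids)              # n_j
    overlap = Counter(zip(true_labels, cluster_ids))  # n_ij

    total = 0.0
    for cls, n_i in class_sizes.items():
        best = 0.0
        for clu, n_j in cluster_sizes.items():
            n_ij = overlap[(cls, clu)]
            if n_ij == 0:
                continue  # no shared documents, F(i, j) is 0
            precision = n_ij / n_j
            recall = n_ij / n_i
            best = max(best, 2 * precision * recall / (precision + recall))
        total += (n_i / n) * best
    return total


if __name__ == "__main__":
    # Hypothetical toy data: four documents, two reference topics.
    topics = ["army", "army", "navy", "navy"]
    print(clustering_f_measure(topics, [0, 0, 1, 1]))
    print(clustering_f_measure(topics, [0, 0, 0, 0]))
```

Under this assumed definition, a clustering that reproduces the reference classes exactly scores 1, and merging everything into a single cluster is penalized through lower precision.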
Keywords :
data mining; feature extraction; military computing; natural language processing; text analysis; Chinese text; F-measure; Youth Daily News; evaluation metric; feature extraction; hierarchical agglomerative algorithm; online military news; quantitative association rule; quantitative association rule approach; text clustering; vector space model; Association rules; Conference management; Content management; Data mining; Data visualization; Feature extraction; Frequency; Information management; Military computing; Text mining; Association rule; Hierarchical agglomerative clustering; Quantitative association rule; Vector space model;
Conference_Title :
Asian Language Processing, 2009. IALP '09. International Conference on
Conference_Location :
Singapore
Print_ISBN :
978-0-7695-3904-1
DOI :
10.1109/IALP.2009.48