DocumentCode :
2746973
Title :
Research on K-means Text Clustering Algorithm Based on Semantic
Author :
Liu, Yufang ; Xiao, Shibin ; Lv, Xueqiang ; Shi, Shuicai
Author_Institution :
Chinese Inf. Process. Res. Center, Beijing Inf. Sci. & Technol. Univ., Beijing, China
Volume :
1
fYear :
2010
fDate :
5-6 June 2010
Firstpage :
124
Lastpage :
127
Abstract :
Building on research into the K-means algorithm for text clustering and the semantic-based vector space model, a semantic-based K-means text clustering model is proposed to address the high dimensionality and sparsity of text data sets. The model reduces the semantic loss of the text data and improves the quality of text clustering. Experiments show that semantic-based text clustering improves the final F1 index value by more than 6 percent over non-semantic-based clustering.
Keywords :
pattern clustering; text analysis; F1 index value; k-means clustering; semantic-based vector space model; text clustering algorithm; Clustering algorithms; Filtering; Industrial engineering; Information processing; Information science; Information technology; Mathematical model; Optical noise; Partitioning algorithms; Space technology; HowNet; K-means algorithm; Term Contribution; semantic similarity; text vector;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computing, Control and Industrial Engineering (CCIE), 2010 International Conference on
Conference_Location :
Wuhan
Print_ISBN :
978-0-7695-4026-9
Type :
conf
DOI :
10.1109/CCIE.2010.39
Filename :
5492048