Regional Information Center for Science and Technology (RICeST) - Research on text clustering algorithm based on improved K-means

DocumentCode :

3415778

Title :

Research on text clustering algorithm based on improved K-means

Author :

Xinwu, Li

Author_Institution :

Electron. Bus. Dept., Jiangxi Univ. of Finance & Econ., Nanchang, China

Volume :

fYear :

2010

fDate :

25-27 June 2010

Abstract :

Text clustering is one of the difficult and active research areas in Internet search engine research. Building on the advantages of K-means clustering while overcoming its disadvantages, the paper presents a new text clustering algorithm. First, texts are preprocessed to meet the requirements of the subsequent steps. Then the paper analyzes the common K-means clustering algorithm, improves its underlying principle, and corrects its cluster seed selection method to overcome the low stability of K-means, which is very sensitive to the initial cluster centers and to isolated-point texts. The experimental results indicate that the improved algorithm achieves higher accuracy and better stability than the original algorithm.
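
The abstract does not say how the corrected seed selection works, so the following is only an illustrative sketch of the general idea, not the paper's method. It assumes documents are already preprocessed into numeric vectors, and it swaps in a simple density filter: isolated points are excluded from the seed candidates, and seeds are then spread out among the remaining dense points before ordinary K-means iterations run. The names `select_seeds`, `kmeans`, `radius`, and `min_neighbors` are hypothetical and chosen for this example.

```python
import math


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def select_seeds(points, k, radius, min_neighbors):
    """Pick k seeds among dense points only (assumed heuristic, not the paper's).

    A point is a candidate if at least `min_neighbors` other points lie
    within `radius`; isolated points therefore never become seeds.
    """
    def density(p):
        # Subtract 1 so the point does not count itself as a neighbor.
        return sum(1 for q in points if dist(p, q) <= radius) - 1

    candidates = [p for p in points if density(p) >= min_neighbors]
    if len(candidates) < k:
        raise ValueError("too few dense points; relax radius or min_neighbors")

    # First seed: the densest candidate. Later seeds: the candidate
    # farthest from every seed chosen so far.
    seeds = [max(candidates, key=density)]
    while len(seeds) < k:
        seeds.append(max(candidates, key=lambda p: min(dist(p, s) for s in seeds)))
    return seeds


def kmeans(points, seeds, max_iter=100):
    """Standard K-means (Lloyd) iterations starting from the given seeds."""
    centers = [list(s) for s in seeds]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centers)), key=lambda c: dist(p, centers[c]))
            for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Toy stand-in for document vectors: two tight groups plus one isolated point.
docs = [(0, 0), (0, 1), (1, 0), (1, 1),
        (10, 10), (10, 11), (11, 10), (11, 11),
        (50, 50)]
seeds = select_seeds(docs, k=2, radius=2.0, min_neighbors=2)
labels, centers = kmeans(docs, seeds)
```

In this toy run the isolated point `(50, 50)` is never chosen as a seed, so the two dense groups each receive their own starting center. The isolated point is still assigned to its nearest cluster afterwards; handling it separately would be a further design choice that the abstract does not describe.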

Keywords :

Internet; pattern clustering; search engines; text analysis; Internet search engine; K-means clustering algorithm; cluster seed selection method; text clustering algorithm; Algorithm design and analysis; Clustering algorithms; Finance; IP networks; Information retrieval; Iterative algorithms; Partitioning algorithms; Search engines; Stability analysis; K-means; Text clustering; cluster seed selection

fLanguage :

English

Publisher :

IEEE

Conference_Titel :

Computer Design and Applications (ICCDA), 2010 International Conference on

Conference_Location :

Qinhuangdao

Print_ISBN :

978-1-4244-7164-5

Electronic_ISBN :

978-1-4244-7164-5

Type :

conf

DOI :

10.1109/ICCDA.2010.5540727

Filename :

5540727

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3415778