Regional Information Center for Science and Technology (RICEST) - The research on Chinese document clustering based on WEKA

DocumentCode :

3277750

Title :

The research on Chinese document clustering based on WEKA

Author :

Han, Pu ; Wang, Dong-Bo ; Zhao, Qing-Guo

Author_Institution :

Dept. of Inf. Manage., Nanjing Univ., Nanjing, China

Volume :

fYear :

2011

fDate :

10-13 July 2011

Firstpage :

1953

Lastpage :

1957

Abstract :

This paper reports an experiment on Chinese document clustering based on WEKA. WEKA is an excellent open-source data mining tool from abroad, but it is rarely used domestically. We clustered Chinese documents with the K-means algorithm by adjusting its parameters in WEKA, and evaluated the results using recall, precision and F-measure. We hope to provide a reference for researchers in this field.
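
The abstract names recall, precision and F-measure as the evaluation criteria but does not give the formulas. The sketch below shows one common way to score a flat clustering against known document categories: for each category, take the best-matching cluster's F-measure, then average weighted by category size. This variant, the function name `clustering_f_measure`, and the toy labels are assumptions for illustration; the paper may define its measures differently, and the sketch does not use WEKA.

```python
from collections import Counter


def clustering_f_measure(true_labels, cluster_ids):
    """Class-weighted best-match F-measure for a flat clustering.

    Assumed (hypothetical) definition, not taken from the paper:
      precision(c, k) = |c and k| / |k|
      recall(c, k)    = |c and k| / |c|
      F(c, k)         = 2 * P * R / (P + R)
      overall         = sum over classes c of (|c| / n) * max over k of F(c, k)
    """
    if len(true_labels) != len(cluster_ids):
        raise ValueError("label and cluster lists must have equal length")

    n = len(true_labels)
    class_sizes = Counter(true_labels)        # documents per true category
    cluster_sizes = Counter(cluster_ids)      # documents per cluster
    overlap = Counter(zip(true_labels, cluster_ids))  # joint counts

    total = 0.0
    for cls, n_cls in class_sizes.items():
        best = 0.0
        for clu, n_clu in cluster_sizes.items():
            n_both = overlap[(cls, clu)]
            if n_both == 0:
                continue  # no shared documents, F is zero
            precision = n_both / n_clu
            recall = n_both / n_cls
            f = 2 * precision * recall / (precision + recall)
            best = max(best, f)
        total += (n_cls / n) * best
    return total


# Toy example: four documents, two categories, two clusters.
perfect = clustering_f_measure(
    ["sports", "sports", "finance", "finance"], [0, 0, 1, 1]
)
mixed = clustering_f_measure(
    ["sports", "sports", "finance", "finance"], [0, 0, 0, 1]
)
```

Cluster assignments produced by any K-means run (for example, exported from WEKA) could be passed as `cluster_ids`, with the documents' known categories as `true_labels`.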

Keywords :

Java; data mining; document handling; learning (artificial intelligence); pattern clustering; Chinese document clustering; F-measure method; K-means algorithm; WEKA; data mining tool; Clustering algorithms; Computational modeling; Feature extraction; Machine learning; Partitioning algorithms; Principal component analysis; Software algorithms; Document clustering; Document feature; Document representation

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Machine Learning and Cybernetics (ICMLC), 2011 International Conference on

Conference_Location :

Guilin

ISSN :

2160-133X

Print_ISBN :

978-1-4577-0305-8

Type :

conf

DOI :

10.1109/ICMLC.2011.6016955

Filename :

6016955

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3277750