DocumentCode :
3277750
Title :
The research on Chinese document clustering based on WEKA
Author :
Han, Pu ; Wang, Dong-Bo ; Zhao, Qing-Guo
Author_Institution :
Dept. of Inf. Manage., Nanjing Univ., Nanjing, China
Volume :
4
fYear :
2011
fDate :
10-13 July 2011
Firstpage :
1953
Lastpage :
1957
Abstract :
This paper reports an experiment on Chinese document clustering using WEKA. WEKA is a well-regarded open-source data mining tool that is widely used abroad but rarely used in China. We cluster Chinese documents with the K-means algorithm, adjusting its parameters in WEKA, and evaluate the results using recall, precision, and F-measure. We hope this work provides a reference for researchers in this field.
Keywords :
Java; data mining; document handling; learning (artificial intelligence); pattern clustering; Chinese document clustering; F-measure method; K-means algorithm; WEKA; data mining tool; Clustering algorithms; Computational modeling; Feature extraction; Machine learning; Partitioning algorithms; Principal component analysis; Software algorithms; Document clustering; Document feature; Document representation
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Machine Learning and Cybernetics (ICMLC), 2011 International Conference on
Conference_Location :
Guilin
ISSN :
2160-133X
Print_ISBN :
978-1-4577-0305-8
Type :
conf
DOI :
10.1109/ICMLC.2011.6016955
Filename :
6016955