DocumentCode
480607
Title
Clustering Description Extraction Based on Statistical Machine Learning
Author
Zhang, Chengzhi ; Xu, Hongjiao
Author_Institution
Dept. of Inf. Manage., Nanjing Univ. of Sci. & Technol., Nanjing
Volume
2
fYear
2008
fDate
20-22 Dec. 2008
Firstpage
22
Lastpage
26
Abstract
The clustering description problem is one of the key issues in traditional document clustering. A traditional document clustering algorithm can cluster the objects, but it cannot give a concept description for the clustered results. Document clustering description is the problem of labeling the clustered results of document collection clustering. Such labels help users determine whether a cluster is relevant to their information requirements. Therefore, labeling a clustered set of documents is an important and challenging task in document clustering applications. To address the weak readability of traditional document clustering results, a method for automatically labeling document clusters based on machine learning is put forward. Experimental results show that the SVM-based method provides users with more concise and comprehensive document clustering results. The results also reflect the linear trend of the clustering description problem.
Keywords
document handling; learning (artificial intelligence); pattern clustering; statistical analysis; clustered document labeling; document clustering description extraction; document collection clustering; statistical machine learning; Clustering algorithms; Data mining; Frequency; Information management; Information technology; Labeling; Learning systems; Machine learning; Machine learning algorithms; Support vector machines; Clustering Description; Document Clustering; Statistical Machine Learning;
fLanguage
English
Publisher
IEEE
Conference_Titel
Intelligent Information Technology Application, 2008. IITA '08. Second International Symposium on
Conference_Location
Shanghai
Print_ISBN
978-0-7695-3497-8
Type
conf
DOI
10.1109/IITA.2008.114
Filename
4739719