DocumentCode
2143150
Title
A text clustering algorithm based on category resolve power
Author
Zhou, Faguo ; Zhang, Fan ; Yang, Bingru
Author_Institution
School of Mechanical Electronic & Information Engineering, China University of Mining & Technology, Beijing, China
fYear
2010
fDate
4-6 Dec. 2010
Firstpage
3494
Lastpage
3497
Abstract
As an unsupervised machine learning technique, document clustering has been widely used in many fields, such as Information Retrieval (IR) and Text Categorization (TC). However, because document clustering uses a bag-of-words representation as the document index, the feature space of a corpus is high-dimensional. This high dimensionality degrades the efficiency and precision of text clustering. Based on category resolve power, a new feature selection function is constructed. By integrating this function with a document clustering algorithm, a high-performance text clustering algorithm is presented. Experiments on a universal corpus show that it performs well.
Keywords
Accuracy; Algorithm design and analysis; Classification algorithms; Clustering algorithms; Machine learning; Partitioning algorithms; Text categorization; dimension reduction; feature selection; result evaluation; text clustering;
fLanguage
English
Publisher
IEEE
Conference_Titel
Information Science and Engineering (ICISE), 2010 2nd International Conference on
Conference_Location
Hangzhou, China
Print_ISBN
978-1-4244-7616-9
Type
conf
DOI
10.1109/ICISE.2010.5690994
Filename
5690994