Title :
A text clustering algorithm based on category resolve power
Author :
Zhou, Faguo ; Zhang, Fan ; Yang, Bingru
Author_Institution :
School of Mechanical Electronic & Information Engineering, China University of Mining & Technology, Beijing, China
Abstract :
As an unsupervised machine learning technique, document clustering has been widely used in many fields, such as Information Retrieval (IR) and Text Categorization (TC). However, because document clustering uses a bag-of-words representation as the document index, the feature space of a corpus is necessarily high-dimensional. This high dimensionality harms both the efficiency and the precision of text clustering. A new feature selection function is constructed based on category resolve power. By integrating this function with a document clustering algorithm, a high-performance text clustering algorithm is presented. Experiments on a universal corpus show that it performs well.
Keywords :
Accuracy; Algorithm design and analysis; Classification algorithms; Clustering algorithms; Machine learning; Partitioning algorithms; Text categorization; dimension reduction; feature selection; result evaluation; text clustering
Conference_Title :
Information Science and Engineering (ICISE), 2010 2nd International Conference on
Conference_Location :
Hangzhou, China
Print_ISBN :
978-1-4244-7616-9
DOI :
10.1109/ICISE.2010.5690994