Title :
Text Categorization Research Based on Cluster Idea
Author :
Lin, Jialun ; Li, Xiaoling ; Jiao, Yuan
Author_Institution :
Comput. Teaching & Res. Sect., Hainan Med. Coll., Haikou, China
Abstract :
Classification and clustering are frequently used methods in data mining. This paper introduces the idea of text clustering into the study of categorization algorithms. The authors use a self-initiated learning text categorization pattern to design a clustering-based text categorization algorithm, with the aim of reducing the dimension of the training set and raising the efficiency of categorization. A series of experiments shows that the algorithm greatly raises efficiency at the cost of a slight reduction in categorization accuracy, balancing the trade-off between the two.
Keywords :
pattern classification; pattern clustering; statistical analysis; text analysis; unsupervised learning; word processing; cluster idea; clustering-based text categorization algorithm; data excavation; self-initiated learning; text categorization pattern; text clustering; training set; Algorithm design and analysis; Clustering algorithms; Clustering methods; Computer science; Computer science education; Educational technology; Management training; Partitioning algorithms; Testing; Text categorization; K-Means algorithm; KNN algorithm; text categorization; text clustering;
Conference_Title :
Education Technology and Computer Science (ETCS), 2010 Second International Workshop on
Conference_Location :
Wuhan
Print_ISBN :
978-1-4244-6388-6
Electronic_ISBN :
978-1-4244-6389-3
DOI :
10.1109/ETCS.2010.413
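Illustrative_Sketch :
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea it describes: cluster the training set first, then categorize against the smaller set of cluster centres. It assumes K-Means is run separately inside each category and that KNN is then applied to the resulting centroids (both algorithms are named in the keywords, but this particular combination is an assumption). The function names, the toy two-dimensional "document vectors" and the labels are hypothetical and do not come from the paper.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def kmeans(vectors, k, iterations=10):
    """Plain K-Means; seeds are the first k vectors (assumed, for determinism)."""
    centroids = [list(v) for v in vectors[:k]]
    for _ in range(iterations):
        groups = [[] for _ in centroids]
        for v in vectors:
            nearest = min(range(len(centroids)),
                          key=lambda i: euclidean(v, centroids[i]))
            groups[nearest].append(v)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [mean(g) if g else c for g, c in zip(groups, centroids)]
    return centroids


def reduce_training_set(training, clusters_per_class):
    """Replace each category's documents with a few labelled centroids."""
    by_label = {}
    for vec, label in training:
        by_label.setdefault(label, []).append(vec)
    reduced = []
    for label, vecs in by_label.items():
        k = min(clusters_per_class, len(vecs))
        reduced.extend((c, label) for c in kmeans(vecs, k))
    return reduced


def knn_classify(query, samples, k=1):
    """Majority vote among the k samples nearest to the query."""
    nearest = sorted(samples, key=lambda s: euclidean(query, s[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: 2-D vectors standing in for document term weights.
training = [
    ([5.0, 0.0], "sports"), ([4.0, 1.0], "sports"),
    ([6.0, 1.0], "sports"), ([5.0, 2.0], "sports"),
    ([0.0, 5.0], "finance"), ([1.0, 4.0], "finance"),
    ([1.0, 6.0], "finance"), ([2.0, 5.0], "finance"),
]

reduced = reduce_training_set(training, clusters_per_class=1)
label = knn_classify([4.5, 0.5], reduced, k=1)
print(len(training), "training vectors reduced to", len(reduced))
print("predicted category:", label)
```

In this sketch the classifier compares each new document with two centroids instead of eight training vectors, which is where an efficiency gain would come from; collapsing documents into centroids also discards detail, which is consistent with the slight loss of accuracy the abstract reports.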