DocumentCode :
3010619
Title :
A SVM-Based Text Classification Method with SSK-Means Clustering Algorithm
Author :
Yan, Hongcan ; Lin, Chen ; Li, Bicheng
Author_Institution :
Zhengzhou Inf. Technol. Inst., Zhengzhou, China
Volume :
2
fYear :
2009
fDate :
7-8 Nov. 2009
Firstpage :
379
Lastpage :
383
Abstract :
SVM-based classification needs a large amount of labeled data to train the classifier model, but labeling a training dataset is time-consuming and labor-intensive. Furthermore, the feature space is commonly sparse because of the high dimensionality of text. Both factors can degrade classification performance. We propose an SVM-based text classification method with the SSK-means clustering algorithm that needs only a small amount of labeled training data. In this approach, the training data, both labeled and unlabeled, are first clustered under the guidance of the labeled data. The unlabeled samples are then labeled according to the clusters obtained, and SVM classifiers are trained on the expanded training dataset. When the training dataset contains only a small amount of labeled data, this method performs better than a plain SVM classifier.
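The clustering-then-labeling step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the details of SSK-means, so the code below is an assumption: a generic seeded k-means in which each class centroid is initialized from the labeled samples, the labeled samples stay in their own class, and each unlabeled sample takes the label of its nearest centroid. The function name `seeded_kmeans`, the toy vectors, and the class names are hypothetical and not taken from the paper.

```python
from math import dist


def seeded_kmeans(labeled, unlabeled, n_iter=20):
    """Pseudo-label unlabeled vectors with a seeded k-means (illustrative only).

    labeled:   list of (vector, label) pairs used as seeds
    unlabeled: list of vectors
    Returns one inferred label per unlabeled vector.
    """
    classes = sorted({y for _, y in labeled})

    def mean(vectors):
        # Component-wise mean of a non-empty list of vectors.
        return [sum(col) / len(vectors) for col in zip(*vectors)]

    # One centroid per class, initialized from the labeled seeds.
    centroids = {c: mean([x for x, y in labeled if y == c]) for c in classes}

    assigned = []
    for _ in range(n_iter):
        # Assign every unlabeled vector to its nearest centroid.
        new_assigned = [
            min(classes, key=lambda c: dist(x, centroids[c])) for x in unlabeled
        ]
        if new_assigned == assigned:
            break  # assignments are stable, so the centroids are too
        assigned = new_assigned
        # Update centroids; labeled seeds always remain in their own class.
        for c in classes:
            members = [x for x, y in labeled if y == c]
            members += [x for x, a in zip(unlabeled, assigned) if a == c]
            centroids[c] = mean(members)
    return assigned


# Toy 2-D "document vectors"; real text features would be high-dimensional.
labeled = [([0.0, 0.0], "sports"), ([10.0, 10.0], "finance")]
unlabeled = [[1.0, 0.5], [9.0, 9.5], [0.5, 1.5], [8.0, 11.0]]

pseudo = seeded_kmeans(labeled, unlabeled)
# Expanded training set: original labeled pairs plus pseudo-labeled pairs.
expanded = labeled + list(zip(unlabeled, pseudo))
```

The `expanded` list could then be passed to any SVM trainer in place of the small labeled set; which SVM variant and kernel the authors used is not stated in the abstract.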
Keywords :
support vector machines; text analysis; SSK-means clustering algorithm; SVM-based text classification method; training data; Artificial intelligence; Classification algorithms; Clustering algorithms; Information technology; Partitioning algorithms; Support vector machine classification; Testing; Text categorization; SVM classification; labeled data
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Artificial Intelligence and Computational Intelligence, 2009. AICI '09. International Conference on
Conference_Location :
Shanghai
Print_ISBN :
978-1-4244-3835-8
Electronic_ISBN :
978-0-7695-3816-7
Type :
conf
DOI :
10.1109/AICI.2009.446
Filename :
5375806