DocumentCode
3010619
Title
A SVM-Based Text Classification Method with SSK-Means Clustering Algorithm
Author
Yan, Hongcan ; Lin, Chen ; Li, Bicheng
Author_Institution
Zhengzhou Inf. Technol. Inst., Zhengzhou, China
Volume
2
fYear
2009
fDate
7-8 Nov. 2009
Firstpage
379
Lastpage
383
Abstract
SVM-based classification needs a large amount of labeled data to train the classifier model, but labeling a training dataset is time-consuming and labor-intensive. Furthermore, the feature space is commonly sparse because of the high dimensionality of text. Both factors can degrade classification performance. We propose an SVM-based text classification method with the SSK-means clustering algorithm that needs only a small amount of labeled training data. In this approach, the training data, including both labeled and unlabeled samples, are first clustered with the guidance of the labeled data. The unlabeled samples are then labeled based on the clusters obtained, and SVM classifiers are trained on the expanded training dataset. When the training dataset contains only a small amount of labeled data, this method performs better than SVM classifiers alone.
Keywords
support vector machines; text analysis; SSK-means clustering algorithm; SVM-based text classification method; training data; Artificial intelligence; Classification algorithms; Clustering algorithms; Information technology; Partitioning algorithms; Support vector machine classification; Testing; Text categorization; SVM classification; labeled data
fLanguage
English
Publisher
IEEE
Conference_Titel
Artificial Intelligence and Computational Intelligence, 2009. AICI '09. International Conference on
Conference_Location
Shanghai
Print_ISBN
978-1-4244-3835-8
Electronic_ISBN
978-0-7695-3816-7
Type
conf
DOI
10.1109/AICI.2009.446
Filename
5375806
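The pipeline outlined in the abstract (cluster labeled and unlabeled samples under guidance of the labeled ones, label the unlabeled samples from their clusters, then train an SVM on the expanded set) can be illustrated roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not define SSK-means in detail, so a seeded k-means variant is assumed (one cluster per class, centroids initialised from the labeled samples), and a small binary linear SVM trained by subgradient descent on the hinge loss stands in for whatever SVM trainer the paper used. All function names, parameters, and the toy data are hypothetical.

```python
# Sketch only. Assumes a seeded k-means variant for "SSK-means" and a simple
# binary linear SVM; neither is confirmed by the abstract. Names are hypothetical.

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def centroid(points):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(points) for col in zip(*points)]

def seeded_kmeans_labels(labeled, unlabeled, max_iter=100):
    """labeled: list of (vector, cls); unlabeled: list of vectors.
    Returns one inferred class per unlabeled vector."""
    classes = sorted({cls for _, cls in labeled})
    # Initial centroids are the per-class means of the labeled samples.
    cents = {c: centroid([v for v, cls in labeled if cls == c]) for c in classes}
    assigned = None
    for _ in range(max_iter):
        # Assign every unlabeled vector to its nearest centroid.
        new = [min(classes, key=lambda c: sq_dist(v, cents[c])) for v in unlabeled]
        if new == assigned:  # assignments stable: converged
            break
        assigned = new
        # Recompute centroids; labeled samples always stay in their own class.
        for c in classes:
            members = [v for v, cls in labeled if cls == c]
            members += [v for v, a in zip(unlabeled, assigned) if a == c]
            cents[c] = centroid(members)
    return assigned

def expand_training_set(labeled, unlabeled):
    """Labeled samples plus unlabeled samples tagged with their cluster's class."""
    inferred = seeded_kmeans_labels(labeled, unlabeled)
    return list(labeled) + list(zip(unlabeled, inferred))

def train_linear_svm(data, epochs=200, lr=0.1, lam=0.01):
    """Binary linear SVM (classes -1/+1) via subgradient descent on the
    L2-regularised hinge loss. Returns (weights, bias)."""
    w = [0.0] * len(data[0][0])
    b = 0.0
    for _ in range(epochs):
        for x, y in data:
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1:  # inside the margin: hinge term is active
                w = [wi - lr * (lam * wi - y * xi) for wi, xi in zip(w, x)]
                b += lr * y
            else:           # only the regulariser contributes
                w = [wi - lr * lam * wi for wi in w]
    return w, b

def predict(w, b, x):
    """Sign of the decision function, as -1 or +1."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1

# Toy 2-D vectors stand in for document term vectors; classes are -1 and +1.
labeled = [([-2.0, -2.0], -1), ([2.0, 2.0], 1)]
unlabeled = [[-1.5, -2.5], [-2.5, -1.0], [2.5, 1.5], [1.0, 3.0]]
expanded = expand_training_set(labeled, unlabeled)
w, b = train_linear_svm(expanded)
print(predict(w, b, [3.0, 3.0]), predict(w, b, [-3.0, -3.0]))
```

In this toy setup only two samples carry labels, and the clustering step supplies labels for the other four before the SVM is trained. For real text, the vectors would be high-dimensional term-weight vectors, and a multi-class or kernel SVM would likely replace the binary linear stand-in used here.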