Title :
A K-Nearest Neighbor Algorithm based on cluster in text classification
Author :
Wang, Chun-Yan ; Yan, Yu-Guang ; Zhang, Kuo ; Li, Jian-Gang
Author_Institution :
Dept. of Comput. Sci. & Technol., Changchun Normal Coll., Changchun, China
Abstract :
The K-Nearest Neighbor (K-NN) algorithm is an important approach to automatic text classification. In this paper, clustering is applied to overcome the disadvantages of the traditional K-NN algorithm. First, the training set is clustered with an improved K-means approach, and the most representative samples are selected as cluster centers. Then the similarity between each test sample and the central vector of each cluster is computed. The result is a cluster-based K-NN algorithm. Experimental results show that this classification algorithm is much faster than the traditional K-NN algorithm and can also improve accuracy.
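The two steps described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' method: it uses plain K-means (the paper's "improved" variant is not specified here), runs it separately per class, measures similarity with the cosine, and classifies by majority vote among the nearest cluster centers. All function names and parameters (build_centers, classify, clusters_per_class) are hypothetical.

```python
import math
from collections import Counter, defaultdict


def cosine(a, b):
    """Cosine similarity of two equal-length term-weight vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def kmeans(points, k, iters=20):
    """Plain K-means (assumption: not the paper's improved variant).

    The first k points seed the centers so the sketch is deterministic.
    """
    centers = [list(p) for p in points[:k]]
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            best = max(range(len(centers)), key=lambda i: cosine(p, centers[i]))
            groups[best].append(p)
        # Keep the old center if a cluster ends up empty.
        new_centers = [mean(g) if g else c for g, c in zip(groups, centers)]
        if new_centers == centers:
            break
        centers = new_centers
    return centers


def build_centers(samples, labels, clusters_per_class=2):
    """Cluster each class separately; return (center_vector, label) pairs."""
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)
    centers = []
    for y, pts in by_class.items():
        k = min(clusters_per_class, len(pts))
        centers.extend((c, y) for c in kmeans(pts, k))
    return centers


def classify(x, centers, k=3):
    """K-NN over cluster centers instead of over all training samples."""
    ranked = sorted(centers, key=lambda cy: cosine(x, cy[0]), reverse=True)
    votes = Counter(y for _, y in ranked[:k])
    return votes.most_common(1)[0][0]
```

Calling build_centers once on the training vectors and then classify for each test vector compares every test sample against a handful of centers rather than the whole training set, which is where the speed-up claimed in the abstract would come from.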
Keywords :
pattern classification; pattern clustering; text analysis; automatic text classification; cluster center; k-means approach; k-nearest neighbor algorithm; training set; Artificial neural networks; Biological system modeling; cluster; k-Nearest Neighbor; text classification
Conference_Title :
Computer, Mechatronics, Control and Electronic Engineering (CMCE), 2010 International Conference on
Conference_Location :
Changchun
Print_ISBN :
978-1-4244-7957-3
DOI :
10.1109/CMCE.2010.5610477