Title :
Improving the k-NN and applying it to Chinese text classification
Author :
Yuan, Fang ; Yang, Liu ; Yu, Ge
Author_Institution :
Coll. of Math. & Comput. Sci., Hebei Univ., China
Abstract :
To address the problems of applying k-NN to Chinese text classification, this paper proposes several improvements to k-NN. Word segmentation based on dictionaries and statistics increases classification accuracy and reduces the number of dimensions. Using a genetic algorithm to learn the value of k makes classification more automatic. A gradual classification mode improves classification efficiency. Experiments show that these improvements raise the efficiency of Chinese text classification while maintaining high accuracy.
Keywords :
classification; genetic algorithms; text analysis; Chinese text classification; classification automatization; genetic algorithm; k-nearest neighbor; word segmentation; Computer science; Educational institutions; Electronic mail; Information science; Internet; Mathematics; Statistics; Testing; Text categorization; gradual classification mode; k-Nearest Neighbor method; text preprocessing
Conference_Title :
Machine Learning and Cybernetics, 2005. Proceedings of 2005 International Conference on
Conference_Location :
Guangzhou, China
Print_ISBN :
0-7803-9091-1
DOI :
10.1109/ICMLC.2005.1527190