DocumentCode
2497985
Title
Noise reduction to text categorization based on density for KNN
Author
Li, Rong-lu ; Hu, Yun-fa
Author_Institution
Comput. Technol. & Inf. Dept., Fudan Univ., Shanghai, China
Volume
5
fYear
2003
fDate
2-5 Nov. 2003
Firstpage
3119
Abstract
With the rapid development of the World Wide Web, text classification has become a key technology for organizing and processing large amounts of document data. As a simple and effective classification approach, the KNN method is widely used in text categorization. However, the KNN classifier not only has large computational demands, but its classification precision may also decrease because of the uneven density of the training data. In this paper, we present a density-based method for reducing noise in the training data, which solves these problems. Our experimental results support this.
Keywords
classification; information retrieval; learning (artificial intelligence); text analysis; KNN classifier; KNN method; World Wide Web; density based method; document data; k-nearest neighbor classifier; noise reduction; text categorization; text classification; training data; Artificial intelligence; Electronic mail; Machine learning; Natural languages; Noise reduction; Organizing; Runtime; Text categorization; Training data; Web sites;
fLanguage
English
Publisher
IEEE
Conference_Titel
Machine Learning and Cybernetics, 2003 International Conference on
Print_ISBN
0-7803-8131-9
Type
conf
DOI
10.1109/ICMLC.2003.1260115
Filename
1260115