DocumentCode
1571955
Title
An Improved Condensing Algorithm
Author
Hao, Xiulan ; Zhang, Chenghong ; Xu, Hexiang ; Tao, Xiaopeng ; Wang, Shuyun ; Hu, Yunfa
Author_Institution
Dept. of CIT, Fudan Univ., Shanghai
fYear
2008
Firstpage
316
Lastpage
321
Abstract
The kNN classifier is widely used in text categorization. However, kNN has large computational and storage requirements, and its performance suffers when the training data are unevenly distributed. Condensing techniques are commonly used to reduce noise in the training data and to lower time and space costs. The traditional condensing technique selects its initial samples at random. Although random sampling is one way to reduce outliers, its high randomness can sometimes lead to poor performance, so that the advantages of sampling are lost. To avoid this, we propose a variation of the traditional condensing technique. Experimental results show that this strategy effectively addresses the above problems.
Keywords
classification; learning (artificial intelligence); neural nets; sampling methods; text analysis; condensing algorithm; kNN classifier; outlier reduction; random sampling; text categorization; training data; Conference management; Costs; Distributed computing; Electronic mail; Information science; Management training; Noise reduction; Sampling methods; Text categorization; Training data; Condensing Algorithm; Selected Seeds; Text Categorization; kNN;
fLanguage
English
Publisher
IEEE
Conference_Titel
Computer and Information Science, 2008. ICIS 08. Seventh IEEE/ACIS International Conference on
Conference_Location
Portland, OR
Print_ISBN
978-0-7695-3131-1
Type
conf
DOI
10.1109/ICIS.2008.67
Filename
4529839