Title :
Reducing Samples Learning for Text Categorization
Author :
Zhan, Yan ; Chen, Hao
Author_Institution :
Coll. of Math. & Comput. Sci., Hebei Univ., Baoding, China
Abstract :
Text Categorization (TC) is an important component of many information organization and information management tasks. TC problems often involve too many instances, which demand substantial computation time and memory. This paper proposes a Generalization Capability (GC) algorithm for reducing samples. The GC algorithm is compared with existing sample-reduction algorithms in text categorization, including Condensed Nearest Neighbor, Selective Nearest Neighbor, the Reduced Nearest Neighbor Rule, and the Edited Nearest Neighbor Rule. In these experiments GC achieves the highest average generalization accuracy, especially in the presence of uniform class noise.
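As an illustration of what a sample-reduction step looks like, the following is a minimal sketch of Condensed Nearest Neighbor (Hart's rule), one of the baseline algorithms named in the abstract. It is not the GC algorithm, whose details this record does not give. The function names, the squared Euclidean distance, and the toy two-dimensional vectors are assumptions made for illustration; in text categorization the vectors would typically be term-weight representations of documents.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance (an assumed metric, chosen for simplicity)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_label(store: List[Tuple[Vector, str]], query: Vector) -> str:
    """Label of the stored sample closest to the query (1-NN)."""
    closest = min(store, key=lambda item: squared_distance(item[0], query))
    return closest[1]


def condensed_nearest_neighbor(samples: List[Vector], labels: List[str]) -> List[int]:
    """Return indices of the samples retained by Hart's condensing rule.

    Start with one sample, then repeatedly add every sample that the
    current retained set misclassifies, until a full pass adds nothing.
    """
    if not samples:
        return []
    kept = [0]
    changed = True
    while changed:
        changed = False
        for i, (x, y) in enumerate(zip(samples, labels)):
            if i in kept:
                continue
            store = [(samples[j], labels[j]) for j in kept]
            if nearest_label(store, x) != y:
                kept.append(i)
                changed = True
    return sorted(kept)


# Toy data: two well-separated classes (hypothetical, not from the paper).
samples = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
labels = ["a", "a", "a", "b", "b", "b"]
kept = condensed_nearest_neighbor(samples, labels)
reduced = [(samples[j], labels[j]) for j in kept]
```

On this toy data the retained set is much smaller than the original, yet a 1-NN classifier built on it still labels every original sample correctly. That consistency property is what condensing aims for; it is also why the method tends to retain mislabeled points, which is relevant to the abstract's remark about class noise.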
Keywords :
classification; set theory; text analysis; generalization capability algorithm; information management; information organization; k-nearest neighbor algorithm; text categorization; Classification; K-NN; Reducing samples; Text Categorization;
Conference_Title :
Information Management, Innovation Management and Industrial Engineering (ICIII), 2010 International Conference on
Conference_Location :
Kunming
Print_ISBN :
978-1-4244-8829-2
DOI :
10.1109/ICIII.2010.307