Title :
The Capability Analysis on the Characteristic Selection Algorithm of Text Categorization Based on F1 Measure Value
Author :
He Shaojun ; Cao Jin ; Guo Ruixu ; Wang Guijun
Author_Institution :
Northern Electron. Instrum. Inst., Beijing, China
Abstract :
Text categorization is an important aspect of natural language processing. It can be used to identify the category information within natural-language text, and thereby helps to address the information clutter problem and the directional detection and scouting of information. This paper presents the general process of text categorization. Taking the Sogou datasets as the target, the capability of several typical characteristic selection algorithms is analyzed using a KNN classification machine with different characteristic dimensions and classification methods, with the text categorization experiments evaluated by the F1 measure value.
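The abstract evaluates results by the F1 measure value, which is the harmonic mean of precision and recall. As a minimal sketch of how that value can be computed per category, the Python below uses only the standard library. The function names `f1_per_class` and `macro_f1` are hypothetical, and macro-averaging across categories is an assumption: the record does not state which averaging the authors used.

```python
from collections import Counter


def f1_per_class(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for every label seen.

    Hypothetical helper for illustration; not taken from the paper.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correctly assigned to its category
        else:
            fp[pred] += 1    # wrongly assigned to the predicted category
            fn[truth] += 1   # missed for the true category

    scores = {}
    for label in set(y_true) | set(y_pred):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        # F1 = 2 * P * R / (P + R), defined as 0 when both are 0
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (precision, recall, f1)
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (averaging scheme is an assumption)."""
    scores = f1_per_class(y_true, y_pred)
    return sum(f1 for _, _, f1 in scores.values()) / len(scores)
```

For example, calling `macro_f1(true_labels, predicted_labels)` on the output of a classifier for each characteristic dimension would give one F1 value per configuration, which is the kind of comparison the abstract describes.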
Keywords :
natural language processing; pattern classification; text analysis; F1 measure value; KNN classification machine; Sogou datasets; capability analysis; categorization information; characteristic dimensions; characteristic selection algorithm; classification methods; directional detection; text categorization; text categorization experiment; Algorithm design and analysis; Classification algorithms; Computational modeling; Data models; Internet; Support vector machines; Characteristic Selection
Conference_Titel :
Instrumentation, Measurement, Computer, Communication and Control (IMCCC), 2012 Second International Conference on
Conference_Location :
Harbin
Print_ISBN :
978-1-4673-5034-1
DOI :
10.1109/IMCCC.2012.180