Title :
A Multilayer Method of Text Feature Extraction Based on CILIN
Author :
Li, Xin-fu ; Zhao, Lei-lei
Author_Institution :
Fac. of Math. & Comput., Hebei Univ., Baoding
fDate :
Aug. 29 2008-Sept. 2 2008
Abstract :
Feature extraction is the most critical technology in text categorization. The proposed method of feature extraction from Chinese text, based on CILIN, differs from conventional feature extraction in that it combines two feature extraction methods. It handles synonyms and polysemes well and reduces the feature dimension. First, it uses CILIN-based feature extraction to analyze the meaning of keywords. Second, it uses mutual information to extract features, which gives the relation between class and lemma. The experimental results suggest that taking the meaning of keywords into account can markedly improve text classification precision.
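The two layers described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the thesaurus `TOY_CILIN`, its category codes, the English toy words, and the function names `to_lemmas`, `mutual_information`, and `select_features` are all hypothetical, and the sketch assumes the common pointwise form MI(t, c) = log(P(t, c) / (P(t) P(c))) estimated from document counts. It merges synonyms under one code but makes no attempt at the sense disambiguation needed for polysemes.

```python
import math
from collections import Counter

# Hypothetical toy stand-in for CILIN: word -> synonym-category code.
# The entries and codes are invented for illustration only.
TOY_CILIN = {
    "car": "Bo21",
    "automobile": "Bo21",
    "doctor": "Ae15",
    "physician": "Ae15",
    "match": "Hj19",
}


def to_lemmas(tokens, thesaurus=TOY_CILIN):
    """Layer 1: replace each word by its category code; keep unknown words."""
    return [thesaurus.get(t, t) for t in tokens]


def mutual_information(docs, labels):
    """Layer 2: MI(t, c) = log(P(t, c) / (P(t) * P(c))) from document counts."""
    n = len(docs)
    class_count = Counter(labels)
    term_count = Counter()
    joint_count = Counter()
    for tokens, label in zip(docs, labels):
        for t in set(tokens):  # count each term once per document
            term_count[t] += 1
            joint_count[(t, label)] += 1
    return {
        (t, c): math.log(n_tc * n / (term_count[t] * class_count[c]))
        for (t, c), n_tc in joint_count.items()
    }


def select_features(docs, labels, k):
    """Map words to category codes, then keep the k terms with highest max MI."""
    lemma_docs = [to_lemmas(d) for d in docs]
    scores = mutual_information(lemma_docs, labels)
    best = {}
    for (t, _c), s in scores.items():
        best[t] = max(best.get(t, float("-inf")), s)
    # Ties are broken alphabetically so the result is deterministic.
    return sorted(best, key=lambda t: (-best[t], t))[:k]


docs = [["car", "doctor"], ["automobile"], ["physician"], ["doctor", "match"]]
labels = ["auto", "auto", "med", "med"]
print(select_features(docs, labels, 2))
```

Because "car" and "automobile" collapse to a single code before scoring, the feature space shrinks and the merged feature accumulates the class evidence of both words, which is the dimension-reduction effect the abstract refers to.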
Keywords :
feature extraction; text analysis; CILIN; Chinese text; multilayer method; text categorization; text classification precision; text feature extraction; Frequency; Mathematics; Mutual information; Niobium; Nonhomogeneous media; Statistics; Support vector machine classification; Support vector machines
Conference_Titel :
Computer Science and Information Technology, 2008. ICCSIT '08. International Conference on
Conference_Location :
Singapore
Print_ISBN :
978-0-7695-3308-7
DOI :
10.1109/ICCSIT.2008.57