DocumentCode :
3318213
Title :
Learning effective features for Chinese text categorization
Author :
Luo, Dingsheng ; Wang, Xinhao ; Wu, Xihong ; Chi, Huisheng
Author_Institution :
Nat. Lab. on Machine Perception, Peking Univ., Beijing, China
fYear :
2005
fDate :
30 Oct.-1 Nov. 2005
Firstpage :
608
Lastpage :
613
Abstract :
Text categorization suffers from high dimensionality, which leaves the learning system with either low efficiency or low performance. A number of feature selection methods, such as document frequency (DF), information gain (IG), and chi-square, have therefore been adopted or proposed for dimensionality reduction. Unlike those traditional methods, this paper presents a feature selection method based on the idea of "discriminative learning", in which learned "effective" features, rather than traditional "important" features, are used to construct the feature space. To learn the effective features, a variant of the AdaBoost algorithm and a pairwise multiclass learning scheme are adopted. Simulation results show that the presented method works well.
Keywords :
classification; feature extraction; learning (artificial intelligence); text analysis; Chinese text categorization; dimensional reduction; discriminative learning; feature selection methods; pairwise multiclass learning scheme; variant AdaBoost algorithm; Bayesian methods; Classification tree analysis; Dictionaries; Feature extraction; Frequency; Information management; Learning systems; Machine learning; Nearest neighbor searches; Text categorization;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Natural Language Processing and Knowledge Engineering, 2005. IEEE NLP-KE '05. Proceedings of 2005 IEEE International Conference on
Print_ISBN :
0-7803-9361-9
Type :
conf
DOI :
10.1109/NLPKE.2005.1598809
Filename :
1598809