DocumentCode :
424107
Title :
Applying class triggers in Chinese POS tagging based on maximum entropy model
Author :
Zhao, Yan ; Wang, Xiao-Long ; Liu, Bing-quan ; Guan, Yi
Author_Institution :
Sch. of Comput. Sci. & Technol., Harbin Inst. of Technol., China
Volume :
3
fYear :
2004
fDate :
26-29 Aug. 2004
Firstpage :
1641
Abstract :
This paper proposes a method for applying class triggers to Chinese POS tagging with a maximum entropy model. First, a "word -> word/tag" feature template is used to extract triggers from a corpus, and the extracted triggers are added to the maximum entropy model as a new kind of feature. Then, average mutual information is used for feature selection, and a semantic lexicon is used to build class triggers that overcome the sparseness problem. In addition, an experience-based solution to the over-fitting problem in model training is presented. Finally, the system is evaluated on a manually annotated POS-tagged corpus. The experiment shows that the method raises POS tagging accuracy from 94% to 96% relative to an HMM baseline smoothed with absolute smoothing.
Keywords :
feature extraction; grammars; hidden Markov models; maximum entropy methods; natural languages; speech processing; Chinese POS tagging; Chinese part of speech tagging; HMM; POS tag corpus; average mutual information; class trigger method; feature selection; feature template; maximum entropy model; model training; overfitting problem; semantic lexicon; sparseness problem; trigger extraction; Computer science; Data mining; Electronic mail; Entropy; Hidden Markov models; Mutual information; Smoothing methods; Speech; Statistics; Tagging;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Machine Learning and Cybernetics, 2004. Proceedings of 2004 International Conference on
Print_ISBN :
0-7803-8403-2
Type :
conf
DOI :
10.1109/ICMLC.2004.1382038
Filename :
1382038