DocumentCode
3501552
Title
Statistical Uyhur POS tagging with TAG predictor for unknown words
Author
Tian, Shengwei ; Ibrahim, Turgun ; Umal, Hasan ; Yu, Long
Author_Institution
Coll. of Inf. Sci. & Eng., Xinjiang Univ., Urumqi, China
Volume
4
fYear
2009
fDate
8-9 Aug. 2009
Firstpage
60
Lastpage
62
Abstract
Automatic text tagging is an important component in higher-level analysis of text corpora, and its output can be used in many natural language processing applications. Trigrams tags is an efficient statistical approach to part-of-speech (POS) tagging. This paper describes a POS tagger for Uyhur text based on a hidden Markov model using trigrams tags. We describe the basic model of trigrams tags, the smoothing techniques used to address the sparse data problem, and a tag predictor for unknown words. A comparison shows that our approach performs well on the tested Uyhur corpora.
Keywords
hidden Markov models; identification technology; speech processing; text analysis; TAG predictor; automatic text tagging; hidden Markov model; natural language processing applications; sparse data problem; statistical Uyhur POS tagging; statistical part-of-speech tagging; text corpora; trigrams tags; Computer networks; Context modeling; Hidden Markov models; Natural language processing; Predictive models; Probability; Smoothing methods; Support vector machines; Tagging; Technical Activities Guide -TAG; Part of Speech; Tag Predictor; Trigrams tags; Unknown Words;
fLanguage
English
Publisher
IEEE
Conference_Titel
Computing, Communication, Control, and Management, 2009. CCCM 2009. ISECS International Colloquium on
Conference_Location
Sanya
Print_ISBN
978-1-4244-4247-8
Type
conf
DOI
10.1109/CCCM.2009.5267823
Filename
5267823