• DocumentCode
    3501552
  • Title

    Statistical Uyhur POS tagging with TAG predictor for unknown words

  • Author

    Tian, Shengwei ; Ibrahim, Turgun ; Umal, Hasan ; Yu, Long

  • Author_Institution
    Coll. of Inf. Sci. & Eng., Xinjiang Univ., Urumqi, China
  • Volume
    4
  • fYear
    2009
  • fDate
    8-9 Aug. 2009
  • Firstpage
    60
  • Lastpage
    62
  • Abstract
    Automatic text tagging is an important component of higher-level analysis of text corpora, and its output can be used in many natural language processing applications. Trigrams tags is an efficient statistical approach to part-of-speech tagging. This paper describes a POS tagger for Uyhur text based on a hidden Markov model using trigrams tags. We describe the basic trigrams tags model, the smoothing techniques used to address the sparse data problem, and a tag predictor for unknown words. A comparison shows that our approach performs well on the tested Uyhur corpora.
  • Keywords
    hidden Markov models; identification technology; speech processing; text analysis; TAG predictor; automatic text tagging; hidden Markov model; natural language processing applications; sparse data problem; statistical Uyhur POS tagging; statistical part-of-speech tagging; text corpora; trigrams tags; Computer networks; Context modeling; Hidden Markov models; Natural language processing; Predictive models; Probability; Smoothing methods; Support vector machines; Tagging; Technical Activities Guide -TAG; Part of Speech; Tag Predictor; Trigrams tags; Unknown Words;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computing, Communication, Control, and Management, 2009. CCCM 2009. ISECS International Colloquium on
  • Conference_Location
    Sanya
  • Print_ISBN
    978-1-4244-4247-8
  • Type
    conf
  • DOI
    10.1109/CCCM.2009.5267823
  • Filename
    5267823