• DocumentCode
    2754892
  • Title

    Automatic Morphological Tagging of Contemporary Uighur Corpus

  • Author

    Altenbek, Gulila

  • Author_Institution
    Inf. Sci. & Eng. Colleges, Xinjiang Univ., Urumqi
  • fYear
    2006
  • fDate
    16-18 Sept. 2006
  • Firstpage
    557
  • Lastpage
    560
  • Abstract
    In this paper, we propose methods for Uighur word lemmatization using morphemic analysis and word structural analysis, integrating morphological processing and part-of-speech (POS) tagging, with the final aim of extracting linguistic information and automatically POS-tagging a Uighur corpus. For regular words, the accuracy of word lemmatization reaches 85% and that of POS tagging reaches 80%.
  • Keywords
    formal languages; word processing; Uighur Corpus; Uighur word lemmatization; automatic morphological tagging; linguistic information; morphemic analysis; morphological processing; part of speech tagging; word structural analysis; Data mining; Educational institutions; Information analysis; Information science; Natural languages; Performance analysis; Shape; Speech analysis; Speech processing; Tagging; POS; Uighur; Word Lemmatization; affix; stem;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Information Reuse and Integration, 2006 IEEE International Conference on
  • Conference_Location
    Waikoloa Village, HI
  • Print_ISBN
    0-7803-9788-6
  • Type

    conf

  • DOI
    10.1109/IRI.2006.252474
  • Filename
    4018551