Title :
Automatic Morphological Tagging of Contemporary Uighur Corpus
Author :
Altenbek, Gulila
Author_Institution :
Inf. Sci. & Eng. Colleges, Xinjiang Univ., Urumqi
Abstract :
In this paper, we propose methods for Uighur word lemmatization based on morphemic analysis and analysis of word structure, integrating morphological processing with part-of-speech (POS) tagging. The ultimate goal is to extract linguistic information from a Uighur corpus and tag its parts of speech automatically. For regular words, lemmatization accuracy reaches 85% and POS tagging accuracy reaches 80%.
Keywords :
formal languages; word processing; Uighur Corpus; Uighur word lemmatization; automatic morphological tagging; linguistic information; morphemic analysis; morphological processing; part of speech tagging; word structural analysis; Data mining; Educational institutions; Information analysis; Information science; Natural languages; Performance analysis; Shape; Speech analysis; Speech processing; Tagging; POS; Uighur; Word Lemmatization; affix; stem;
Conference_Title :
Information Reuse and Integration, 2006 IEEE International Conference on
Conference_Location :
Waikoloa Village, HI
Print_ISBN :
0-7803-9788-6
DOI :
10.1109/IRI.2006.252474