DocumentCode
2754892
Title
Automatic Morphological Tagging of Contemporary Uighur Corpus
Author
Altenbek, Gulila
Author_Institution
Inf. Sci. & Eng. Colleges, Xinjiang Univ., Urumqi
fYear
2006
fDate
16-18 Sept. 2006
Firstpage
557
Lastpage
560
Abstract
In this paper, we propose methods for Uighur word lemmatization that use morphemic analysis and word structural analysis, integrating morphological processing and part-of-speech (POS) tagging, with the final purpose of extracting linguistic information from a Uighur corpus and POS tagging it automatically. For regular words, the accuracy of word lemmatization reaches 85% and that of POS tagging reaches 80%.
Keywords
formal languages; word processing; Uighur Corpus; Uighur word lemmatization; automatic morphological tagging; linguistic information; morphemic analysis; morphological processing; part of speech tagging; word structural analysis; Data mining; Educational institutions; Information analysis; Information science; Natural languages; Performance analysis; Shape; Speech analysis; Speech processing; Tagging; POS; Uighur; Word Lemmatization; affix; stem;
fLanguage
English
Publisher
IEEE
Conference_Titel
Information Reuse and Integration, 2006 IEEE International Conference on
Conference_Location
Waikoloa Village, HI
Print_ISBN
0-7803-9788-6
Type
conf
DOI
10.1109/IRI.2006.252474
Filename
4018551