Title :
The research of proofreading for the Uigur character
Author_Institution :
Dept. of Electron., Xinjiang Univ., Urumqi, China
Abstract :
The Uigur language belongs to the Turkic branch of the Altaic language family. This paper analyses the common error types in pre-proofread Uigur text, and discusses how to establish a corpus and a rule base, and how to handle part-of-speech tagging and word class ambiguity resolution. It also presents a part-of-speech tagging method based on the grammatical characteristics of word classes, which combines a rule base with corpus statistics.
Keywords :
grammars; text analysis; text editing; Altaic language; Turkic language; Uigur language; corpus; corpus statistics; part-of-speech tagging; pre-proofread text errors; proofreading; rule base; word class ambiguity syncopate; word class grammatical character; Character recognition; Computer errors; Dictionaries; Educational institutions; Information science; Keyboards; Libraries; Natural languages; Speech analysis; Tagging;
Conference_Title :
Systems, Man, and Cybernetics, 2001 IEEE International Conference on
Conference_Location :
Tucson, AZ
Print_ISBN :
0-7803-7087-2
DOI :
10.1109/ICSMC.2001.973026