Title :
A comparative study on Arabic POS tagging using Quran corpus
Author :
Alashqar, Abdelkareem M.
Author_Institution :
Fac. of Inf. Technol., Islamic Univ. of Gaza, Gaza, Palestinian Authority
Abstract :
POS tagging is the process of computationally assigning the correct part of speech to each word of a given input text, depending on its context. Many POS tagging techniques have been developed in the literature, but most have been evaluated on English. Some comparable work has been done for Arabic, yet comparative studies of POS tagging for Arabic remain scarce. In this paper we compare the performance of several POS tagging techniques for Arabic using the Quran corpus: the N-gram, Brill, HMM, and TnT taggers. The comparison experiments were conducted on both diacritized and undiacritized classical Arabic, to determine which technique performs best in this setting.
Keywords :
grammars; identification technology; natural languages; text analysis; Arabic POS tagging techniques; Arabic language; Brill; English language; HMM; N-gram; Quran corpus; TnT taggers; part of speech; Accuracy; Context; Hidden Markov models; Informatics; Natural language processing; Speech; Tagging; NLP; Natural Language Processing; POS; Part of Speech Tagging; Quran Corpus;
Conference_Title :
Informatics and Systems (INFOS), 2012 8th International Conference on
Conference_Location :
Cairo
Print_ISBN :
978-1-4673-0828-1