Title :
Comparison between two Arabic tagsets
Author :
Rashwan, Mohsen A A ; Khalil, Enas A H ; Rafea, Ahmed
Author_Institution :
Dept. of Electron. & Electr. Commun., Cairo Univ., Cairo, Egypt
Abstract :
Enhancing Arabic tagging is of great importance in many NLP applications. This paper presents a simple comparison tool that compares two powerful tagging systems for Arabic. The first is the ASVM Tagger by Diab, M. et al. The second is the RDI Arab Tagger (ArabTagger), which relies on simple but powerful long n-gram probability estimation plus the A* search algorithm for disambiguation. The comparison is done to carry the strengths of ArabTagger over into the ASVM Tagger. Based on this comparison, a mapper tool is implemented to convert the fine-grained Arab tagset (62 tags, used by ArabTagger) to the coarse-grained, compact Reduced Tagset (RTS) of 24 tags used by the ASVM Tagger. A combined system is then formed from the output of both taggers. In our experiment it gives a higher average accuracy than ASVM alone: 95% for the hybrid system versus 93% for the ASVM system.
Keywords :
natural language processing; probability; support vector machines; tree searching; A* search algorithm; ASVM Tagger; ArabTagger; Arabic tagging; NLP applications; RDI Arab Tagger; long n-grams probability estimation; reduced tagset; Application software; Computer science; Data mining; Labeling; Machine learning; Natural languages; Power engineering and energy; Speech processing; Tagging; Automatic Support Vector Machine (ASVM); N-gram model; Part-of-Speech Tagging (POS); Reduced Tag Set (RTS)
Conference_Title :
Natural Language Processing and Knowledge Engineering, 2009. NLP-KE 2009. International Conference on
Conference_Location :
Dalian
Print_ISBN :
978-1-4244-4538-7
Electronic_ISBN :
978-1-4244-4540-0
DOI :
10.1109/NLPKE.2009.5313767