DocumentCode :
2259760
Title :
Comparison between two Arabic tagsets
Author :
Rashwan, Mohsen A A ; Khalil, Enas A H ; Rafea, Ahmed
Author_Institution :
Dept. of Electron. & Electr. Commun., Cairo Univ., Cairo, Egypt
fYear :
2009
fDate :
24-27 Sept. 2009
Firstpage :
1
Lastpage :
8
Abstract :
Enhancing Arabic tagging is of great importance in many NLP applications. This paper presents a simple comparison tool that compares two powerful tagging systems for Arabic. The first is the ASVM Tagger by Diab, M. et al. The second is the RDI ArabTagger, which relies on simple but powerful long n-gram probability estimation plus the A* search algorithm for disambiguation. The comparison is done in order to carry the strengths of the ArabTagger over into the ASVM Tagger. Based on this comparison, a mapper tool is implemented to convert the fine-grained Arab tagset (62 tags, used by the ArabTagger) to the coarse-grained, compact Reduced Tagset (RTS) of 24 tags used by the ASVM Tagger. A combined system is then formed from the output of both taggers; in our experiment it gives a higher average accuracy than ASVM alone: 95% for the hybrid system versus 93% for the ASVM system.
Keywords :
natural language processing; probability; support vector machines; tree searching; A* search algorithm; ASVM Tagger; ArabTagger; Arabic tagging; NLP applications; RDI Arab Tagger; long n-grams probability estimation; reduced tagset; Application software; Computer science; Data mining; Labeling; Machine learning; Natural languages; Power engineering and energy; Speech processing; Tagging; Automatic Support Vector Machine (ASVM); N-gram model; Part-of-Speech Tagging (POS); Reduced Tag Set (RTS)
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Natural Language Processing and Knowledge Engineering, 2009. NLP-KE 2009. International Conference on
Conference_Location :
Dalian
Print_ISBN :
978-1-4244-4538-7
Electronic_ISBN :
978-1-4244-4540-0
Type :
conf
DOI :
10.1109/NLPKE.2009.5313767
Filename :
5313767