Title :
Combining classifiers for supertagging Arabic texts
Author :
Ben Othmane Zribi, Chiraz ; Ben Fraj, Feriel ; Ben Ahmed, Mohamed
Author_Institution :
RIADI-GDL Lab., ENSI, La Manouba, Tunisia
Abstract :
This paper addresses supertagging Arabic texts with the ArabTAG formalism, a semi-lexicalised grammar based on Tree Adjoining Grammar (TAG) and adapted for Arabic. Supertagging is useful because it reduces and speeds up the work of parsing. We treat the problem as a classification task in which supertags (elementary structures, which serve as the classes) are assigned to the words of a given sentence according to their description (morpho-syntactic and contextual information). We propose combining three classifiers, Naïve Bayes, k-Nearest Neighbors (k-NN) and a decision tree, through a voting procedure. The preliminary results were satisfactory: we obtained an accuracy of 76% despite the small size of our training corpus (5,000 words) and the difficulties posed by the specificities of the Arabic language.
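The voting step mentioned in the abstract could be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' procedure: the abstract does not say how votes are counted or how ties are resolved, so plain majority voting with ties going to the first classifier is assumed here. The function names and the supertag labels ("S1", "S2", ...) are hypothetical.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(predictions: Sequence[str]) -> str:
    """Return the label predicted by the most classifiers.

    Assumption: on a tie, the label from the earliest classifier
    in the list wins. The source does not specify a tie-break rule.
    """
    if not predictions:
        raise ValueError("no predictions to vote on")
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label
    raise AssertionError("unreachable")


def vote_sentence(per_classifier: Sequence[Sequence[str]]) -> List[str]:
    """Combine per-word supertag predictions from several classifiers.

    per_classifier[i][j] is classifier i's supertag for word j.
    """
    return [majority_vote(word_preds) for word_preds in zip(*per_classifier)]


# Hypothetical outputs of the three classifiers for a 3-word sentence.
naive_bayes = ["S1", "S2", "S3"]
knn = ["S1", "S4", "S3"]
decision_tree = ["S5", "S4", "S6"]

combined = vote_sentence([naive_bayes, knn, decision_tree])
print(combined)
```

In this sketch each word receives the supertag that at least two of the three classifiers agree on; when all three disagree, the first classifier's prediction is kept, which is one arbitrary choice among several possible ones.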
Keywords :
Bayes methods; decision trees; learning (artificial intelligence); natural language processing; pattern classification; text analysis; ArabTAG formalism; Arabic text supertagging; Naïve Bayes classifiers; contextual information; decision tree classifiers; elementary structures supertags; k-nearest neighbors classifiers; morphosyntactic description; semilexicalised grammar; Grammar; ArabTAG; Arabic language; Supertagging; classification; ensemble learning; machine learning; tree adjoining grammar;
Conference_Title :
Natural Language Processing and Knowledge Engineering (NLP-KE), 2010 International Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-4244-6896-6
DOI :
10.1109/NLPKE.2010.5587841