Title :
Combined Classification for Extracting Named Entities from Arabic Texts
Author :
Fériel Ben Fraj; Chiraz Ben Othmane Zribi; Wiem Kouki
Author_Institution :
RIADI Lab., La Manouba Univ., Tunisia
fDate :
4/1/2015 12:00:00 AM
Abstract :
In this paper, we describe an approach for extracting named entities (NEs) from Arabic texts. The Arabic language is hard to process because of its characteristics, which also affect NE extraction. We treat NE extraction as a typical classification problem: the task consists of searching for text portions that can be assigned to an NE class (Person, Locality or Organization). We therefore use a supervised learning approach and employ the BIO tagging format, which addresses the twin problems of segmentation and categorization. In addition, a single classifier cannot give good results for all types of contexts, so we adopt a set of weighted classifiers that we combine through a voting procedure. To assess the performance of our system properly, we perform two types of tests: with and without morphological attributes. We consider the results highly satisfactory, with an accuracy that exceeds 89% for both the Person and Locality classes.
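The combination of BIO tagging and weighted voting described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the classifiers, the weights, or the exact voting rule, so everything below is an assumption made for illustration: the function names `weighted_vote` and `bio_to_spans`, the integer weights, the alphabetical tie-break, and the transliterated sample tokens are all hypothetical and not taken from the paper.

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Combine per-token BIO tags from several classifiers.

    predictions: one tag sequence per classifier, all of equal length.
    weights: one weight per classifier (hypothetical values).
    """
    combined = []
    for i in range(len(predictions[0])):
        scores = defaultdict(float)
        for tags, weight in zip(predictions, weights):
            scores[tags[i]] += weight
        # The tag with the highest total weight wins; ties are broken
        # alphabetically so the result is deterministic (an assumption).
        combined.append(max(sorted(scores), key=scores.get))
    return combined


def bio_to_spans(tokens, tags):
    """Turn a BIO tag sequence into (entity text, class) pairs."""
    spans = []
    current_tokens, current_type = [], None
    for token, tag in zip(tokens, tags):
        starts_entity = tag.startswith("B-") or (
            tag.startswith("I-") and tag[2:] != current_type
        )
        if starts_entity:
            if current_tokens:
                spans.append((" ".join(current_tokens), current_type))
            current_tokens, current_type = [token], tag[2:]
        elif tag.startswith("I-"):
            current_tokens.append(token)
        else:  # "O": outside any entity
            if current_tokens:
                spans.append((" ".join(current_tokens), current_type))
            current_tokens, current_type = [], None
    if current_tokens:
        spans.append((" ".join(current_tokens), current_type))
    return spans


# Hypothetical example: three classifiers tag the same five tokens.
tokens = ["Ahmed", "Ben", "Ali", "visited", "Tunis"]
predictions = [
    ["B-PER", "I-PER", "I-PER", "O", "B-LOC"],
    ["B-PER", "I-PER", "O", "O", "B-LOC"],
    ["B-PER", "O", "O", "O", "O"],
]
weights = [4, 3, 2]  # assumed weights, e.g. derived from validation scores

voted_tags = weighted_vote(predictions, weights)
entities = bio_to_spans(tokens, voted_tags)
print(voted_tags)
print(entities)
```

In this sketch the B- and I- prefixes handle segmentation (where an entity starts and continues) while the suffix (PER, LOC, ORG) handles categorization, which is how a single tag sequence can cover both problems.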
Keywords :
"Organizations","Tagging","Context","Training","Supervised learning","Pragmatics","Robustness"
Conference_Titel :
Arabic Computational Linguistics (ACLing), 2015 First International Conference on
Print_ISBN :
978-1-4673-9154-2
DOI :
10.1109/ACLing.2015.15