DocumentCode
234363
Title
A new method to construct a statistical model for Arabic language
Author
Sadiqui, Ali ; Zinedine, Ahmed
Author_Institution
Fac. of Sci. Dhar El Mahrez, Sidi Mohamed Ben Abdellah Univ., Atlas, Morocco
fYear
2014
fDate
20-22 Oct. 2014
Firstpage
296
Lastpage
299
Abstract
Language models are a key component of modern automatic language processing systems. In this study we present a new approach to building a statistical language model of Arabic for non-vocalized texts. The approach overcomes the morphological complexity of Arabic and addresses the limitations of existing morphological analyzers. Most morphological analyzers take the classic approach of analyzing each word out of its context, and therefore generate several candidate segmentations. Our solution uses trellises to retain all the segmentation possibilities generated by the morphological analyzer and then builds the language model from them. To realize this solution, we used the following tools: AraMorph, Lattice-Tool from the SRILM toolkit, and AT & WSF. The language model was estimated from a corpus of 100 K words and tested on a corpus of 7 K words. The results and their analysis are presented in this paper.
Keywords
computational linguistics; natural language processing; statistical analysis; text analysis; Arabic language processing; language model; morphological analyzer; nonvocalized text; statistical model; Analytical models; Complexity theory; Context; Decision support systems; Arabic Language Model; Automatic Arabic Language processing; Non-vocalized text; Statistical Model
fLanguage
English
Publisher
IEEE
Conference_Titel
Information Science and Technology (CIST), 2014 Third IEEE International Colloquium in
Conference_Location
Tetouan
Print_ISBN
978-1-4799-5978-5
Type
conf
DOI
10.1109/CIST.2014.7016635
Filename
7016635