Title :
N-scheme model: An approach towards reducing Arabic language sparseness
Author :
Mohamed Achraf Ben Mohamed;Sarra Zrigui;Anis Zouaghi;Mounir Zrigui
Author_Institution :
Faculty of Sciences of Monastir, Tunisia
Abstract :
In addition to the traditional characteristics of natural languages, such as implicitness, ambiguity, and imprecision, Arabic is known for its sparseness, which explains the difficulty of processing it automatically. On the other hand, the Arabic language has an interesting property: lemmas are generated by derivation from roots and schemes. Schemes are molds that change the form of a root through elongation, repetition, or the addition of characters. Schemes can also contribute meaning to the generated word. In this work we study the statistical characteristics of the Arabic language at the level of schemes and highlight the attenuation of sparseness at this level. We then explore the possibility of building natural language processing tools for Arabic that rely on schemes. We find that schemes have great potential for building accurate natural language processing tools for Arabic. Based entirely or partially on schemes, we built an n-scheme statistical model and a text classification system.
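The idea of modeling scheme sequences instead of word sequences can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the authors' implementation: the class name `NSchemeModel`, the add-one smoothing, the Latin transliterations, and the hand-assigned scheme labels (`fa3ala`, `faa3il`, `maf3uul`) for a toy set of token sequences. It only shows why counting n-grams over schemes can yield fewer distinct events than counting over surface words.

```python
from collections import Counter


def ngrams(seq, n):
    """Return the list of n-grams (as tuples) found in seq."""
    return [tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)]


class NSchemeModel:
    """Add-one smoothed n-gram model over scheme labels (illustrative only)."""

    def __init__(self, n=2):
        self.n = n
        self.ngram_counts = Counter()
        self.context_counts = Counter()
        self.vocab = set()

    def train(self, sequences):
        for schemes in sequences:
            # Pad so that the first and last schemes also have a context.
            padded = ["<s>"] * (self.n - 1) + list(schemes) + ["</s>"]
            self.vocab.update(padded)
            for gram in ngrams(padded, self.n):
                self.ngram_counts[gram] += 1
                self.context_counts[gram[:-1]] += 1

    def prob(self, context, scheme):
        """P(scheme | context) with add-one (Laplace) smoothing."""
        context = tuple(context)
        numerator = self.ngram_counts[context + (scheme,)] + 1
        denominator = self.context_counts[context] + len(self.vocab)
        return numerator / denominator


# Toy data: transliterated words and hand-assigned scheme labels (hypothetical).
word_seqs = [["kataba", "kaatib", "maktuub"], ["darasa", "daaris", "madruus"]]
scheme_seqs = [["fa3ala", "faa3il", "maf3uul"], ["fa3ala", "faa3il", "maf3uul"]]

# Distinct bigrams at the word level versus the scheme level.
word_bigrams = {g for s in word_seqs for g in ngrams(s, 2)}
scheme_bigrams = {g for s in scheme_seqs for g in ngrams(s, 2)}

model = NSchemeModel(n=2)
model.train(scheme_seqs)
p = model.prob(("fa3ala",), "faa3il")
```

In this toy setup the two word sequences share no bigrams, while their scheme sequences are identical, so the scheme-level model sees each bigram twice. A real system would need a morphological analyzer to assign schemes, which is outside the scope of this sketch.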
Keywords :
"Natural language processing","Silicon","Buildings","Computational modeling","Neural networks","Vocabulary","Training"
Conference_Title :
Information & Communication Technology and Accessibility (ICTA), 2015 5th International Conference on
DOI :
10.1109/ICTA.2015.7426895