Author_Institution :
Res. Groups on Intell. Machines, Univ. of Sfax, Sfax, Tunisia
Abstract :
Nowadays, Internet communication, especially informal communication such as social networks and blogs, shapes political, economic, financial and social environments all over the world. Consequently, Internet monitoring is growing in importance, particularly in Tunisia, which has suffered from instability since the political revolution of 2011. In the Tunisian context, Internet communication is characterized by the increasing use of the aeb language (i.e., the Arabic dialect called Tunisian Arabic). Therefore, Tunisian Internet monitoring primarily needs aeb language processing tools, especially an aeb lexicon. However, few aeb lexicons have been developed, owing to the lack of written resources. Some of these lexicons are derived from Arabic lexicons: they cover the part of the aeb lexicon that is originally Arabic and ignore the large borrowed part. Others are built from the informal Web and therefore need rigorous linguistic verification, correction and validation. To address this, we propose building a standard, large and robust Wordnet that takes phonetics into account. Our Wordnet is created by the expand approach used for building EuroWordnet, as in [12], based on the bilingual English-Tunisian Arabic Peace Corps dictionary prepared by the linguists R. Ben abdelkader, A. Ayed and A. Naouar [13], and on the latest version of Princeton Wordnet, PWN 3.1. Moreover, it is modeled according to ISO-LMF using a Wordnet-LMF model adapted to the aeb language. In this paper, we present our approach to building the aeb wordnet, describe its current state and propose extensions.
Keywords :
"Buildings","Dictionaries","Internet","Semantics","Adaptation models","Context","Standards"