Title : 
Linguistic integration information in the AABATAS Arabic text analysis system
         
        
            Author : 
Kanoun, Slim ; Ennaji, Adellatif ; Lecourtier, Yves ; Alimi, Adel M.
         
        
            Author_Institution : 
Perception Syst. Inf. Lab., Rouen Univ., Mont Saint Aignan, France
         
        
        
        
        
        
            Abstract : 
An Arabic text analysis system called AABATAS (affixal approach-based Arabic text analysis system) is proposed. AABATAS recognizes and categorizes words while identifying their morphological and grammatical characteristics. It is based on a new approach to Arabic word recognition, called the affixal approach, which is guided by the structural properties of the language. The system uses a dynamic decomposition-recognition mechanism that generates a set of reliable solutions for each word. This mechanism attempts to identify the basic morphemes of the word (the prefix, the infix, the suffix and the root), in contrast to existing approaches, which are usually based on recognition of the whole word, the pseudo-word or the letter. In this paper, we briefly present the general characteristics of Arabic texts and a succinct survey of the existing approaches to their recognition. We then describe the structural properties of the Arabic language and the two systems based on these properties: the first performs word recognition and the second is devoted to text analysis. Finally, we report two experimental results, one on a data set of 545 words and the other on a text example.
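The idea of decomposing a word into prefix, infix, suffix and root, and keeping every decomposition that is consistent with the language's structure, can be illustrated with a minimal sketch. This is not the AABATAS implementation. The affix inventories, the root list, the infix constraints and all function names below are hypothetical toy assumptions, and words are written in Buckwalter-style ASCII transliteration for readability.

```python
from typing import List, NamedTuple, Optional

# Hypothetical toy inventories (Buckwalter-style transliteration).
# A real system would rely on complete linguistic tables.
PREFIXES = ["", "Al", "w", "wAl"]   # e.g. definite article, conjunction
SUFFIXES = ["", "wn", "At", "p"]    # e.g. plural and feminine endings
ROOTS = {"ktb", "Elm", "drs"}       # triliteral roots
INFIX_LETTERS = set("Awyt")         # letters allowed inside a stem (assumed)
MAX_INFIX_LEN = 2                   # assumed bound on infix length


class Analysis(NamedTuple):
    prefix: str
    root: str
    infix: str
    suffix: str


def match_root(stem: str, root: str) -> Optional[str]:
    """Return the non-root letters of stem if root occurs in it as an
    ordered subsequence and those letters form a plausible infix."""
    extra = []
    i = 0
    for ch in stem:
        if i < len(root) and ch == root[i]:
            i += 1               # consume the next root letter
        else:
            extra.append(ch)     # candidate infix letter
    if i != len(root):
        return None              # root letters not all found in order
    if len(extra) > MAX_INFIX_LEN or not set(extra) <= INFIX_LETTERS:
        return None              # leftover letters are not a valid infix
    return "".join(extra)


def decompose(word: str) -> List[Analysis]:
    """Enumerate every (prefix, root, infix, suffix) split of word that
    is consistent with the toy inventories above."""
    results = []
    for prefix in PREFIXES:
        if not word.startswith(prefix):
            continue
        rest = word[len(prefix):]
        for suffix in SUFFIXES:
            if not rest.endswith(suffix):
                continue
            stem = rest[:len(rest) - len(suffix)]
            for root in sorted(ROOTS):
                infix = match_root(stem, root)
                if infix is not None:
                    results.append(Analysis(prefix, root, infix, suffix))
    return results


if __name__ == "__main__":
    for w in ["AlkAtb", "wAlkAtbwn", "ktb"]:
        print(w, decompose(w))
```

Under these assumptions, "AlkAtb" yields a single candidate with prefix "Al", root "ktb" and infix "A". Returning a list rather than one answer mirrors the abstract's notion of a set of solutions per word; ranking or filtering those candidates is left out of the sketch.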
         
        
            Keywords : 
computational linguistics; handwritten character recognition; AABATAS Arabic text analysis system; Arabic word recognition; affixal approach-based Arabic text analysis system; dynamic decomposition-recognition mechanism; grammatical characteristics; linguistic integration information; morphological characteristics; structural properties; word recognition process; Character recognition; Databases; Laboratories; Machine intelligence; Optical character recognition software; Optical sensors; Text analysis; Text recognition; Vocabulary; Writing
         
        
        
        
            Conference_Title : 
Proceedings of the Eighth International Workshop on Frontiers in Handwriting Recognition, 2002
         
        
            Print_ISBN : 
0-7695-1692-0
         
        
        
            DOI : 
10.1109/IWFHR.2002.1030941