Title :
Efficient Arabic text extraction and recognition using thinning and dataset comparison technique
Author :
Saudagar, Abdul Khader Jilani ; Mohammed, Habeeb Vulla ; Iqbal, Kamran ; Gyani, Yasir Javed
Author_Institution :
Coll. of Comput. & Inf. Sci., Al Imam Mohammad Ibn Saud Islamic Univ., Riyadh, Saudi Arabia
Abstract :
This paper proposes a novel technique for Arabic text extraction and recognition, as part of research aimed at developing a system that extracts moving Arabic text from video for efficient content-based indexing and searching. Numerous text extraction techniques have been proposed in the past, but very few of them focus on Arabic text, and none of the earlier implementations attains 100% accuracy in the text extraction and recognition process. The proposed technique thins the given sample image containing Arabic text and splits the resulting image horizontally (X-axis direction) from right to left in equal intervals. Each part of the image is then compared with the samples in the dataset for an equal number of white pixels. Upon a match, the corresponding character is looked up by its index value and stored in an array. This process is repeated with varying splitting intervals until all the characters in the sample image are recognized. To our knowledge, our research is the first to address the above problem and to propose a solution with increased retrieval accuracy and reduced computation time for Arabic text extraction and recognition.
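The split-and-compare step described in the abstract might be sketched roughly as follows. This is an illustrative reading of the abstract, not the authors' implementation: it assumes the image has already been thinned and binarized (1 = white text pixel), and the names `recognize`, `dataset_counts`, `dataset_chars`, and `intervals` are hypothetical. Matching on white-pixel count alone can be ambiguous when two samples share a count, so the sketch simply takes the first matching index.

```python
def white_pixel_count(image, left, right):
    """Count white (1) pixels in columns [left, right) of a binary image."""
    return sum(sum(row[left:right]) for row in image)


def split_right_to_left(width, interval):
    """Yield (left, right) column bounds of equal strips, starting at the right edge."""
    right = width
    while right > 0:
        left = max(0, right - interval)
        yield left, right
        right = left


def recognize(image, dataset_counts, dataset_chars, intervals):
    """Try each splitting interval until every non-blank strip matches a sample.

    dataset_counts[i] is the white-pixel count of sample i (assumed layout);
    dataset_chars[i] is the character stored under the same index.
    Returns the recognized characters in right-to-left reading order,
    or an empty list if no interval recognizes every strip.
    """
    width = len(image[0])
    for interval in intervals:
        recognized = []
        for left, right in split_right_to_left(width, interval):
            count = white_pixel_count(image, left, right)
            if count == 0:
                continue  # blank strip, nothing to recognize
            if count not in dataset_counts:
                recognized = None  # this interval fails; vary it and retry
                break
            index = dataset_counts.index(count)  # index value of the matching sample
            recognized.append(dataset_chars[index])
        if recognized:
            return recognized
    return []


# Toy 2x6 thinned image: the right three columns hold 5 white pixels,
# the left three columns hold 3.
image = [
    [1, 1, 0, 1, 1, 1],
    [1, 0, 0, 1, 0, 1],
]
result = recognize(image, dataset_counts=[5, 3],
                   dataset_chars=["ba", "alif"], intervals=[2, 3])
print(result)
```

In this toy input an interval of 2 leaves one strip unmatched, so the loop moves on to an interval of 3, where both strips match a sample.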
Keywords :
content-based retrieval; image matching; natural language processing; text detection; Arabic text extraction process; Arabic text recognition process; Arabic video text extraction; computation time; content-based indexing; content-based searching; dataset comparison technique; index value; thinning; Character recognition; Data mining; Feature extraction; Image color analysis; Image edge detection; Indexes; Text recognition; Arabic Text Extraction; Arabic Text Recognition; Indexing; Searching; Traversing;
Conference_Title :
Communication, Information & Computing Technology (ICCICT), 2015 International Conference on
Conference_Location :
Mumbai
Print_ISBN :
978-1-4799-5521-3
DOI :
10.1109/ICCICT.2015.7045725