DocumentCode :
3037563
Title :
Script recognition in images with complex backgrounds
Author :
Gllavata, Julinda ; Freisleben, Bernd
Author_Institution :
SFB/FK, Siegen Univ.
fYear :
2005
fDate :
21 Dec. 2005
Firstpage :
589
Lastpage :
594
Abstract :
The extraction of textual information from images and videos is an important task for automatic content-based indexing and retrieval. To extract text from images or videos coming from unknown international sources, the script must be known beforehand so that suitable text segmentation and optical character recognition (OCR) methods can be employed. In this paper, we present an approach for discriminating between Latin and ideographic scripts. The proposed approach proceeds as follows: first, the text present in an image is localized. Then, a set of low-level features is extracted from the localized text image. Finally, based on the extracted features, the script type is decided using a k-nearest neighbour classifier. Initial experimental results on a set of images containing text in different scripts demonstrate the good performance of the proposed solution.
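The final step described in the abstract, a k-nearest neighbour decision over extracted features, can be sketched as follows. This is only an illustrative sketch and not the authors' implementation: the abstract does not name the low-level features, so the two-dimensional feature vectors, the sample values in TRAINING, the function name knn_classify, and the choice of Euclidean distance with k=3 are all assumptions made for the example.

```python
import math
from collections import Counter


def knn_classify(query, training_set, k=3):
    """Return the majority label among the k training vectors nearest to query.

    training_set is a list of (feature_vector, label) pairs.
    Euclidean distance is an assumption; the record does not state the metric.
    """
    if not training_set:
        raise ValueError("training_set must not be empty")
    # Rank the labelled examples by distance to the query and keep the k closest.
    neighbours = sorted(
        training_set, key=lambda item: math.dist(query, item[0])
    )[:k]
    # Majority vote over the labels of the k closest examples.
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical feature vectors for localized text regions (values invented
# for illustration only; they are not taken from the paper).
TRAINING = [
    ((0.20, 0.10), "latin"),
    ((0.25, 0.15), "latin"),
    ((0.30, 0.12), "latin"),
    ((0.70, 0.60), "ideographic"),
    ((0.75, 0.65), "ideographic"),
    ((0.80, 0.55), "ideographic"),
]

if __name__ == "__main__":
    print(knn_classify((0.22, 0.12), TRAINING))
    print(knn_classify((0.78, 0.60), TRAINING))
```

In a full pipeline, the query vector would come from the feature-extraction step applied to a localized text region; that step is left out here because the record gives no detail about it.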
Keywords :
content-based retrieval; feature extraction; indexing; natural languages; optical character recognition; pattern classification; text analysis; Ideographic script; Latin script; automatic content-based indexing; k-nearest neighbour classifier; script recognition; text segmentation; textual information extraction; Data mining; Image recognition; Image retrieval; Image segmentation; Information retrieval; Optical character recognition software; Videos
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Signal Processing and Information Technology, 2005. Proceedings of the Fifth IEEE International Symposium on
Conference_Location :
Athens
Print_ISBN :
0-7803-9313-9
Type :
conf
DOI :
10.1109/ISSPIT.2005.1577163
Filename :
1577163