• DocumentCode
    3037563
  • Title
    Script recognition in images with complex backgrounds
  • Author
    Gllavata, Julinda ; Freisleben, Bernd
  • Author_Institution
    SFB/FK, Siegen Univ.
  • fYear
    2005
  • fDate
    21 Dec. 2005
  • Firstpage
    589
  • Lastpage
    594
  • Abstract
    The extraction of textual information from images and videos is an important task for automatic content-based indexing and retrieval purposes. To extract text from images or videos coming from unknown international sources, it is necessary to know the script beforehand in order to employ suitable text segmentation and optical character recognition (OCR) methods. In this paper, we present an approach for discriminating between Latin and Ideographic script. The proposed approach proceeds as follows: first, the text present in an image is localized. Then, a set of low-level features is extracted from the localized text image. Finally, based on the extracted features, the decision about the type of the script is made using a k-nearest neighbour classifier. Initial experimental results for a set of images containing text of different scripts demonstrate the good performance of the proposed solution.
  • Keywords
    content-based retrieval; feature extraction; indexing; natural languages; optical character recognition; pattern classification; text analysis; Ideographic script; Latin script; automatic content-based indexing; k-nearest neighbour classifier; script recognition; text segmentation; textual information extraction; Data mining; Image recognition; Image retrieval; Image segmentation; Information retrieval; Optical character recognition software; Videos
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Proceedings of the Fifth IEEE International Symposium on Signal Processing and Information Technology, 2005
  • Conference_Location
    Athens
  • Print_ISBN
    0-7803-9313-9
  • Type
    conf
  • DOI
    10.1109/ISSPIT.2005.1577163
  • Filename
    1577163
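
The final step described in the abstract, a k-nearest neighbour decision over extracted features, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `knn_classify`, the two-dimensional feature vectors, the Euclidean distance, and the choice k = 3 are all assumptions made for illustration, and the toy values stand in for whatever low-level features the paper actually extracts.

```python
import math
from collections import Counter


def knn_classify(query, samples, labels, k=3):
    """Return the majority label among the k training samples nearest to query.

    Hypothetical helper: the distance metric (Euclidean) and k are
    illustrative assumptions, not taken from the paper.
    """
    if not samples or len(samples) != len(labels):
        raise ValueError("samples and labels must be non-empty and of equal length")
    k = min(k, len(samples))
    # Rank training samples by Euclidean distance to the query vector.
    ranked = sorted(
        (math.dist(query, sample), label) for sample, label in zip(samples, labels)
    )
    # Majority vote over the k closest samples.
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Made-up 2-D feature vectors (placeholders, not real measurements).
train_features = [
    (0.20, 0.30), (0.25, 0.35), (0.30, 0.20),  # labelled "latin"
    (0.80, 0.70), (0.75, 0.90), (0.90, 0.80),  # labelled "ideographic"
]
train_labels = ["latin"] * 3 + ["ideographic"] * 3

prediction = knn_classify((0.28, 0.30), train_features, train_labels, k=3)
```

In a full pipeline, the query vector would come from the feature extraction step applied to a localized text region; here it is supplied by hand.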