Title :
Multi-polarity text segmentation using graph theory
Author :
Li, Jia ; Tian, Yonghong ; Huang, Tiejun ; Gao, Wen
Author_Institution :
Inst. of Comput. Technol., Chinese Acad. of Sci., Beijing
Abstract :
Text segmentation, also called text binarization, is usually an essential step in extracting text information from images and videos. However, most existing text segmentation methods have difficulty extracting multi-polarity text, that is, text with multiple colors or intensities in the same line. In this paper, we propose a novel algorithm for multi-polarity text segmentation based on graph theory. By representing a text image as an undirected weighted graph and partitioning it iteratively, a multi-polarity text image can be effectively split into several single-polarity text images. These single-polarity images are then segmented by single-polarity text segmentation algorithms. Experiments on thousands of multi-polarity text images show that our algorithm can effectively segment multi-polarity texts.
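The abstract does not state how the graph is built, how edges are weighted, or which partitioning criterion is used. Purely as an illustration of the general idea, and not the authors' method, the sketch below assumes a 4-neighbour pixel graph with Gaussian intensity-similarity weights, a spectral bipartition (sign of the Fiedler vector of the graph Laplacian), and a stopping rule based on the intensity range inside a segment. The names build_graph, bipartition, split_polarities, sigma and max_spread are all hypothetical.

```python
import numpy as np


def build_graph(image, sigma=0.2):
    """Undirected weighted graph over pixels (assumed construction):
    4-neighbour edges weighted by a Gaussian of the intensity difference."""
    h, w = image.shape
    flat = image.ravel()
    weights = np.zeros((h * w, h * w))
    for r in range(h):
        for c in range(w):
            i = r * w + c
            for dr, dc in ((0, 1), (1, 0)):  # right and down neighbours
                rr, cc = r + dr, c + dc
                if rr < h and cc < w:
                    j = rr * w + cc
                    wt = np.exp(-((flat[i] - flat[j]) ** 2) / sigma ** 2)
                    weights[i, j] = weights[j, i] = wt  # symmetric: undirected
    return weights


def bipartition(weights):
    """Split a graph in two by the sign of the Fiedler vector
    (a standard spectral heuristic, used here as an assumption)."""
    laplacian = np.diag(weights.sum(axis=1)) - weights
    _, vectors = np.linalg.eigh(laplacian)  # eigenvalues in ascending order
    return vectors[:, 1] > 0


def split_polarities(image, max_spread=0.1, sigma=0.2):
    """Partition iteratively until every segment's intensity range is at
    most max_spread (hypothetical stopping rule). Returns a label map."""
    flat = image.ravel()
    weights = build_graph(image, sigma)
    labels = np.zeros(flat.size, dtype=int)
    pending = [np.arange(flat.size)]  # segments still to be examined
    next_label = 0
    while pending:
        idx = pending.pop()
        values = flat[idx]
        if values.max() - values.min() > max_spread:
            side = bipartition(weights[np.ix_(idx, idx)])
            if side.any() and not side.all():  # guard against empty splits
                pending += [idx[side], idx[~side]]
                continue
        labels[idx] = next_label  # segment accepted as single-polarity
        next_label += 1
    return labels.reshape(image.shape)
```

Each resulting label region would then be handed to an ordinary single-polarity binarization step. The dense weight matrix keeps the sketch short but only suits very small images; a sparse representation would be needed for realistic sizes.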
Keywords :
directed graphs; image colour analysis; image segmentation; text analysis; graph theory; multipolarity text image; multipolarity text segmentation; named text binarization; single-polarity text segmentation algorithms; text information extraction; undirected weighted graph; Color; Computers; Data mining; Graph theory; Image segmentation; Iterative algorithms; Partitioning algorithms; Support vector machine classification; Support vector machines; Videos; Graph theory; Image segmentation; Text processing;
Conference_Title :
Image Processing, 2008. ICIP 2008. 15th IEEE International Conference on
Conference_Location :
San Diego, CA
Print_ISBN :
978-1-4244-1765-0
Electronic_ISBN :
1522-4880
DOI :
10.1109/ICIP.2008.4712428