DocumentCode :
29220
Title :
Robust Text Detection in Natural Scene Images
Author :
Xu-Cheng Yin ; Xuwang Yin ; Kaizhu Huang ; Hong-Wei Hao
Author_Institution :
Dept. of Comput. Sci. & Technol., Univ. of Sci. & Technol. Beijing, Beijing, China
Volume :
36
Issue :
5
fYear :
2014
fDate :
May 2014
Firstpage :
970
Lastpage :
983
Abstract :
Text detection in natural scene images is an important prerequisite for many content-based image analysis tasks. In this paper, we propose an accurate and robust method for detecting texts in natural scene images. A fast and effective pruning algorithm is designed to extract Maximally Stable Extremal Regions (MSERs) as character candidates using the strategy of minimizing regularized variations. Character candidates are grouped into text candidates by the single-link clustering algorithm, where distance weights and clustering threshold are learned automatically by a novel self-training distance metric learning algorithm. The posterior probabilities of text candidates corresponding to non-text are estimated with a character classifier; text candidates with high non-text probabilities are eliminated and texts are identified with a text classifier. The proposed system is evaluated on the ICDAR 2011 Robust Reading Competition database; the f-measure is over 76%, much better than the state-of-the-art performance of 71%. Experiments on multilingual, street view, multi-orientation and even born-digital databases also demonstrate the effectiveness of the proposed method.
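The grouping step described in the abstract can be illustrated with a minimal sketch of single-link clustering under a weighted distance and a cutoff threshold. This is not the authors' implementation: the feature tuple (x-center, y-center, height), the weighted absolute-difference metric, the weights, and the threshold are all hypothetical values chosen by hand, whereas the paper learns the weights and threshold with its self-training distance metric learning algorithm. Cutting a single-link hierarchy at a threshold is equivalent to taking connected components of the graph that links every pair within that threshold, which is what the union-find structure below computes.

```python
from itertools import combinations


def weighted_distance(a, b, weights):
    """Weighted sum of absolute feature differences (hypothetical metric)."""
    return sum(w * abs(x - y) for w, x, y in zip(weights, a, b))


def single_link_clusters(features, weights, threshold):
    """Single-link clustering cut at `threshold`.

    Two items share a cluster when a chain of pairwise links, each with
    distance <= threshold, connects them.
    """
    parent = list(range(len(features)))

    def find(i):
        # Follow parent pointers to the root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(features)), 2):
        if weighted_distance(features[i], features[j], weights) <= threshold:
            parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(features)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Hypothetical character candidates: (x-center, y-center, height) in pixels.
candidates = [
    (10, 50, 20),
    (40, 51, 21),
    (72, 49, 20),
    (300, 200, 40),
    (340, 201, 40),
]
weights = (0.05, 0.5, 0.5)  # hand-picked for illustration, not learned
threshold = 3.5             # hand-picked for illustration, not learned

groups = single_link_clusters(candidates, weights, threshold)
print(groups)
```

With these made-up values the first three candidates form one group and the last two another. Candidates 0 and 2 are farther apart than the threshold but are still joined through candidate 1, which is the chaining behavior characteristic of single-link clustering.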
Keywords :
learning (artificial intelligence); object detection; pattern clustering; probability; MSER; born-digital databases; clustering threshold; content-based image analysis; distance weights; f-measure; maximally stable extremal regions; minimizing regularized variations strategy; multilingual database; multiorientation database; natural scene images; posterior probabilities; pruning algorithm; robust text detection; self-training distance metric learning algorithm; single-link clustering algorithm; street view database; text classifier; Algorithm design and analysis; Clustering algorithms; Databases; Educational institutions; Measurement; Robustness; Vegetation; Computing Methodologies; Image Processing and Computer Vision; Scene Analysis; Scene text detection; Text processing; distance metric learning; maximally stable extremal regions; single-link clustering;
fLanguage :
English
Journal_Title :
Pattern Analysis and Machine Intelligence, IEEE Transactions on
Publisher :
IEEE
ISSN :
0162-8828
Type :
jour
DOI :
10.1109/TPAMI.2013.182
Filename :
6613482