DocumentCode :
185671
Title :
Reading numbers in natural scene images with convolutional neural networks
Author :
Qiang Guo ; Jun Lei ; Dan Tu ; Guohui Li
Author_Institution :
Dept. of Inf. Syst. & Manage., Nat. Univ. of Defense Technol., Changsha, China
fYear :
2014
fDate :
18-19 Oct. 2014
Firstpage :
48
Lastpage :
53
Abstract :
Reading text from natural images is a hard computer vision task. We present a method for applying deep convolutional neural networks to recognize numbers in natural scene images. In this paper, we propose a novel method that eliminates the need for explicit segmentation when dealing with multi-digit number recognition in natural scene images. A convolutional neural network (CNN) requires fixed-dimensional input, while number images contain an unknown number of digits. Our method integrates a CNN with a probabilistic graphical model to deal with this problem. We use a hidden Markov model (HMM) to model the image and a CNN to model digit appearance. This method combines the advantages of both models and makes them fit the problem. Using this method, we can perform both training and recognition at the word level. There is no explicit segmentation operation at all, which saves much of the labour of designing sophisticated segmentation algorithms or fine-grained character labeling. Experiments show that a deep CNN can dramatically improve performance compared with using a Gaussian mixture model as the digit model. We obtained competitive results on the Street View House Numbers (SVHN) dataset.
Keywords :
Gaussian processes; character recognition; computer vision; hidden Markov models; image segmentation; mixture models; neural nets; probability; CNN; Gaussian mixture model; HMM; SVHN dataset; computer vision task; convolutional neural networks; explicit segmentation; hidden Markov model; multidigit number recognition; natural scene images; probabilistic graphical model; street view house number; Hidden Markov models; Image recognition; Image segmentation; Neural networks; Text recognition; Training; Viterbi algorithm;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Security, Pattern Analysis, and Cybernetics (SPAC), 2014 International Conference on
Conference_Location :
Wuhan
Print_ISBN :
978-1-4799-5352-3
Type :
conf
DOI :
10.1109/SPAC.2014.6982655
Filename :
6982655