Title :
Label Detection and Recognition for USPTO Images Using Convolutional K-Means Feature Quantization and Ada-Boost
Author :
Siyu Zhu ; Zanibbi, Richard
Author_Institution :
Center for Imaging Sci., Rochester Inst. of Technol., Rochester, NY, USA
Abstract :
We use Coates' unsupervised feature learning method and AdaBoost to detect and recognize part label regions in patent drawings. Image patches are harvested from training data, and features are learned from patterns in these patches. Angle distances between samples and feature banks are computed and used as input to an AdaBoost classifier. We extract image patches of different sizes to address the scale problem. An AdaBoost ensemble is used to classify pixels as text or background, and Meta-Boost is introduced to improve performance. The pixel-level detections are then grouped into connected components. Several denoising methods are applied, followed by Tesseract OCR. Our system achieves competitive performance without using strong prior knowledge.
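The angle-distance step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not define the distance, so this assumes it is the angle (arccosine of the cosine similarity) between a flattened patch and each learned centroid. The function names `angular_distance` and `patch_features` and the toy feature bank are hypothetical, not taken from the paper.

```python
import math

def angular_distance(u, v):
    """Angle in radians between two non-zero vectors (assumed definition)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_sim = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.acos(cos_sim)

def patch_features(patch, feature_bank):
    """Map a flattened patch to one angle per centroid in the feature bank."""
    return [angular_distance(patch, centroid) for centroid in feature_bank]

# Toy feature bank with two hypothetical centroids; in practice these would
# come from k-means run on patches harvested from the training images.
bank = [[1.0, 0.0], [0.0, 1.0]]
feats = patch_features([2.0, 0.0], bank)
```

Under this assumption, the resulting vector of angles would be the per-pixel input to a boosted classifier; the angle is insensitive to patch magnitude, so only the direction of the patch vector matters.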
Keywords :
image classification; text detection; unsupervised learning; AdaBoost classifier; USPTO images; connected component; convolutional k-means feature quantization; image patches; label detection; label recognition; patent drawings; pixel level detections; unsupervised feature learning method; Accuracy; Feature extraction; Image recognition; Optical character recognition software; Patents; Training; Vectors; AdaBoost; feature learning
Conference_Title :
Document Analysis and Recognition (ICDAR), 2013 12th International Conference on
Conference_Location :
Washington, DC
DOI :
10.1109/ICDAR.2013.130