DocumentCode :
1632367
Title :
Finding Images and Line-Drawings in Document-Scanning Systems
Author :
Baluja, Shumeet ; Covell, Michele
Author_Institution :
Google, Inc., WA, USA
fYear :
2009
Firstpage :
1096
Lastpage :
1100
Abstract :
The system presented in this paper finds images and line-drawings in scanned pages; it is a crucial processing step in building a large-scale system to detect and index images found in books and historic documents. Within scanned pages that contain both text and images, the images are found by applying SIFT-based local features to the complete scanned page. A novel learning system then categorizes the detected SIFT features as either text or image. The discrimination uses multiple classifiers trained via AdaBoost. With this system, we improve image detection by finding more line-drawings, graphics, and photographs, and by reducing the number of spurious detections caused by misclassified text, discolorations, and scanning artifacts.
Keywords :
Ada; document image processing; indexing; large-scale systems; learning (artificial intelligence); object detection; pattern classification; AdaBoost; SIFT-based local-features; classifiers; document-scanning systems; image detection; image finding; image indexing; large-scale system; line-drawings; novel learning system; Art; Character recognition; Computational intelligence; Image analysis; Image segmentation; Informatics; Laboratories; Robustness; Telecommunication computing; Text analysis; document scanning; historic books; historic manuscripts; local descriptors
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Document Analysis and Recognition, 2009. ICDAR '09. 10th International Conference on
Conference_Location :
Barcelona
ISSN :
1520-5363
Print_ISBN :
978-1-4244-4500-4
Electronic_ISBN :
1520-5363
Type :
conf
DOI :
10.1109/ICDAR.2009.106
Filename :
5277481