DocumentCode
3488307
Title
Greedy Search for Active Learning of OCR
Author
Agarwal, Abhishek ; Garg, Radhika ; Chaudhury, Santanu
Author_Institution
Dept. of Electr. Eng., Indian Inst. of Technol., Delhi, New Delhi, India
fYear
2013
fDate
25-28 Aug. 2013
Firstpage
837
Lastpage
841
Abstract
Active learning and crowd sourcing are becoming increasingly popular in the machine learning community for fast, cost-effective generation of labels for large volumes of data. However, such labels may be noisy, so it becomes important to ignore the noisy labels when building a good classifier. We propose a framework for finding the best possible augmentation of a classifier for the character recognition problem using a minimum number of crowd-labeled samples. The approach inherently rejects the noisy data and tries to accept a subset of correctly labeled data to maximize classifier performance.
Keywords
image classification; learning (artificial intelligence); optical character recognition; search problems; OCR; active learning; character recognition problem; classifier; crowd labeled samples; greedy search; noisy data rejection; optical character recognition; Accuracy; Character recognition; Noise; Noise measurement; Optical character recognition software; Support vector machines; Training; Character recognition; Indian scripts; active learning; crowd sourcing; greedy search; incremental SVM;
fLanguage
English
Publisher
IEEE
Conference_Titel
Document Analysis and Recognition (ICDAR), 2013 12th International Conference on
Conference_Location
Washington, DC
ISSN
1520-5363
Type
conf
DOI
10.1109/ICDAR.2013.171
Filename
6628736