DocumentCode :
595479
Title :
Label-noise reduction with support vector machines
Author :
Fefilatyev, S. ; Shreve, Matthew ; Kramer, K. ; Hall, Leonard ; Goldgof, Dmitry ; Kasturi, Rangachar ; Daly, K. ; Remsen, A. ; Bunke, Horst
fYear :
2012
fDate :
11-15 Nov. 2012
Firstpage :
3504
Lastpage :
3508
Abstract :
The problem of detecting label noise in large datasets is investigated. We consider applications where data are susceptible to label error and a human expert is available to verify a limited number of labels in order to cleanse the data. We show that the support vectors of a Support Vector Machine (SVM) contain almost all of the noisy labels, so verifying the support vectors allows efficient cleansing of the data. Empirical results are presented for two experiments. In the first experiment, two datasets from the character recognition domain are used and artificial random noise is applied to their labels. In the second experiment, a large dataset of plankton images, which contains inadvertent human label error, is considered. It is shown that up to 99% of all label noise in such datasets can be detected by verifying just the support vectors of the SVM classifier.
Keywords :
character recognition; data handling; image classification; support vector machines; SVM classifier; artificial random noise; character recognition domain; data cleansing; human expert; human label error; label-noise reduction; large datasets; noisy labels; plankton images; support vectors verification; Humans; Machine learning; Noise; Noise measurement; Training; Training data
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Pattern Recognition (ICPR), 2012 21st International Conference on
Conference_Location :
Tsukuba
ISSN :
1051-4651
Print_ISBN :
978-1-4673-2216-4
Type :
conf
Filename :
6460920