Probabilistic learning from mislabelled data for multimedia content recognition

Author

Kakar, Pravin ; Chia, Alex Yong-Sang

Author_Institution

Inst. for Infocomm Res., Singapore, Singapore

fYear

2015

fDate

June 29 2015-July 3 2015

Firstpage

1

Lastpage

6

Abstract

There have been considerable advances in multimedia recognition recently as powerful computing capabilities and large, representative datasets become ubiquitous. A fundamental assumption of traditional recognition techniques is that the data available for training are accurately labelled. Given the scale and diversity of web data, it takes considerable annotation effort to reduce label noise to acceptable levels. In this work, we propose a novel method to work around this issue by utilizing approximate a priori estimates of the mislabelling probabilities to design a noise-aware learning framework. We demonstrate the proposed framework's effectiveness on several datasets of various modalities and show that it is able to achieve high levels of accuracy even when faced with significant mislabelling in the data.

Keywords

Internet; image denoising; image recognition; learning (artificial intelligence); multimedia computing; Web data; computing capabilities; label noise reduction; mislabelled data; multimedia content recognition; noise-aware learning framework; probabilistic learning; representative datasets; Accuracy; Neural networks; Noise; Noise level; Noise measurement; Testing; Training

fLanguage

English

Publisher

IEEE

Conference_Titel

Multimedia and Expo (ICME), 2015 IEEE International Conference on

Conference_Location

Turin

Type

conf

DOI

10.1109/ICME.2015.7177393

Filename

7177393