DocumentCode :
1796084
Title :
Multi-task deep learning for image understanding
Author :
Bo Yu; Ian Lane
Author_Institution :
State Key Laboratory of Remote Sensing Science, Institute of Remote Sensing and Digital Earth, Beijing, China
fYear :
2014
fDate :
11-14 Aug. 2014
Firstpage :
37
Lastpage :
42
Abstract :
Deep learning models can obtain state-of-the-art performance across many speech and image processing tasks, often significantly outperforming earlier methods. In this paper, we attempt to further improve the performance of these models by introducing multi-task training, in which a combined deep learning model is trained for two inter-related tasks. We show that by introducing a secondary task (such as shape identification in the object classification task) we are able to significantly improve the performance of the main task for which the model is trained. Using public datasets, we evaluated our approach on two image understanding tasks: image segmentation and object classification. On the image segmentation task, we observed that the multi-task model almost doubled the accuracy of segmentation at the pixel level (from 18.7% to 35.6%) compared to the single-task model, and improved the performance of face detection by 10.2 percentage points (from 70.1% to 80.3%). For the object classification task, we observed a 2.1 percentage point improvement in classification accuracy (from 91.6% to 93.7%) compared to a single-task model. The proposed multi-task models obtained significantly higher accuracies than previously published results on these datasets, obtaining 22.0% and 6.2% higher accuracies on the face detection and object classification tasks respectively. These results demonstrate the effectiveness of multi-task training of deep learning models for image understanding tasks.
Keywords :
image segmentation; learning (artificial intelligence); pattern classification; face detection; image processing; image understanding; multitask deep learning; multitask training; object classification; speech processing; Accuracy; Data models; Face; Face detection; Solid modeling; Three-dimensional displays; Training; deep learning; multi-task learning
fLanguage :
English
Publisher :
IEEE
Conference_Title :
2014 6th International Conference of Soft Computing and Pattern Recognition (SoCPaR)
Conference_Location :
Tunis
Type :
conf
DOI :
10.1109/SOCPAR.2014.7007978
Filename :
7007978