Regional Information Center for Science and Technology (RICEST) - Multi-task deep learning for image understanding

DocumentCode :

1796084

Title :

Multi-task deep learning for image understanding

Author :

Bo Yu ; Lane, Ian

Author_Institution :

State Key Lab. of Remote Sensing Sci., Inst. of Remote Sensing & Digital Earth, Beijing, China

fYear :

2014

fDate :

11-14 Aug. 2014

Firstpage :

Lastpage :

Abstract :

Deep learning models can obtain state-of-the-art performance across many speech and image processing tasks, often significantly outperforming earlier methods. In this paper, we attempt to further improve the performance of these models by introducing multi-task training, in which a combined deep learning model is trained for two inter-related tasks. We show that by introducing a secondary task (such as shape identification in the object classification task) we are able to significantly improve the performance of the main task for which the model is trained. Using public datasets, we evaluated our approach on two image understanding tasks: image segmentation and object classification. On the image segmentation task, we observed that the multi-task model almost doubled the accuracy of segmentation at the pixel level (from 18.7% to 35.6%) compared to the single-task model, and improved the performance of face detection by 10.2 percentage points (from 70.1% to 80.3%). For the object classification task, we observed an improvement of 2.1 percentage points in classification accuracy (from 91.6% to 93.7%) compared to a single-task model. The proposed multi-task models obtained significantly higher accuracies than previously published results on these datasets, obtaining 22.0% and 6.2% higher accuracies on the face detection and object classification tasks respectively. These results demonstrate the effectiveness of multi-task training of deep learning models for image understanding tasks.
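The abstract describes multi-task training only at a high level: one combined model, a main task, and a related secondary task. The sketch below is one minimal way such joint training can look, not the authors' architecture, which this record does not specify. It assumes a single shared tanh feature feeding two logistic heads, trained by plain gradient descent on a weighted sum of the two losses. The names `train_multitask`, `aux_weight`, and `TOY_DATA`, as well as the toy labels, are hypothetical and chosen only for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def bce(p, y, eps=1e-12):
    """Binary cross-entropy for one prediction, clamped to avoid log(0)."""
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def train_multitask(data, aux_weight=0.5, lr=0.5, epochs=200):
    """Jointly train a shared layer and two task heads (illustrative only).

    data: list of (x, y_main, y_aux), x a list of floats, labels in {0, 1}.
    Joint loss per sample: bce(main) + aux_weight * bce(aux).
    Returns (shared_weights, main_head, aux_head, loss_history).
    """
    dim, n = len(data[0][0]), len(data)
    w = [0.1] * dim      # shared layer, updated by gradients from both tasks
    a, b = 0.1, 0.1      # main-task head and secondary-task head
    history = []
    for _ in range(epochs):
        gw, ga, gb, loss = [0.0] * dim, 0.0, 0.0, 0.0
        for x, y_main, y_aux in data:
            h = math.tanh(sum(wi * xi for wi, xi in zip(w, x)))  # shared feature
            p_main, p_aux = sigmoid(a * h), sigmoid(b * h)
            loss += bce(p_main, y_main) + aux_weight * bce(p_aux, y_aux)
            d_main = p_main - y_main                 # dLoss/dlogit, main task
            d_aux = aux_weight * (p_aux - y_aux)     # dLoss/dlogit, secondary task
            ga += d_main * h
            gb += d_aux * h
            # Both heads send gradient back through the shared feature.
            d_pre = (d_main * a + d_aux * b) * (1.0 - h * h)
            for i, xi in enumerate(x):
                gw[i] += d_pre * xi
        history.append(loss / n)  # mean joint loss before this epoch's update
        w = [wi - lr * gi / n for wi, gi in zip(w, gw)]
        a -= lr * ga / n
        b -= lr * gb / n
    return w, a, b, history


# Hypothetical toy data: (features, main label, secondary label).
TOY_DATA = [
    ([1.0, 0.5], 1, 1), ([0.8, -0.2], 1, 1), ([0.3, 0.9], 1, 1),
    ([-1.0, -0.3], 0, 0), ([-0.6, 0.1], 0, 0), ([-0.2, -0.8], 0, 0),
    ([-0.2, 0.9], 1, 0), ([0.4, -0.9], 0, 1),
]

if __name__ == "__main__":
    _, _, _, losses = train_multitask(TOY_DATA)
    print("first/last mean joint loss:", losses[0], losses[-1])
```

Setting `aux_weight=0.0` removes the secondary task's gradient entirely, which gives a single-task baseline to compare against under the same assumptions.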

Keywords :

image segmentation; learning (artificial intelligence); pattern classification; face detection; image processing; image understanding; multitask deep learning; multitask training; object classification; speech processing; Accuracy; Data models; Face; Face detection; Solid modeling; Three-dimensional displays; Training; deep learning; multi-task learning

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Soft Computing and Pattern Recognition (SoCPaR), 2014 6th International Conference of

Conference_Location :

Tunis

Type :

conf

DOI :

10.1109/SOCPAR.2014.7007978

Filename :

7007978

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1796084