DocumentCode :
2492487
Title :
Text-Aided Image Classification: Using Labeled Text from Web to Help Image Classification
Author :
Lin, Yuan ; Chen, Yuqiang ; Xue, Guirong ; Yu, Yong
Author_Institution :
Dept. of Comput. Sci. & Eng., Shanghai Jiao Tong Univ., Shanghai, China
fYear :
2010
fDate :
6-8 April 2010
Firstpage :
267
Lastpage :
273
Abstract :
As more and more multimedia data become available on the Web, mining those data plays an increasingly important role in Web applications. In this paper, we investigate the interplay between multimedia data mining and text data mining. Specifically, in an approach we call text-aided image classification (TAIC), we address the problem of image classification with a very limited amount of labeled images and a large amount of auxiliary labeled text data. This problem is important in practice, since labeled text data on the Web are usually far more abundant than image data. To solve the problem, based on the “bag-of-words” view and the Naive Bayes classification model, we focus on estimating the image feature distribution under a given concept. We extend the Naive Bayes algorithm by considering a mapping that maps the most discriminative text features into the image feature space. This feature mapping is estimated from text-image co-occurrence data on the Web and acts as a bridge connecting text and image knowledge. With this process, we estimate the target image feature distribution from a text model trained on sufficient labeled data. Our empirical results on real-world data sets show that our method yields a good approximation of the image feature distribution obtained when training with abundant labeled images. When the amount of labeled images is very limited, classification performance is improved by using auxiliary labeled text data, which shows that our method can indeed integrate text and image knowledge in a simple yet effective way.
Keywords :
Bayes methods; Internet; data mining; image classification; multimedia systems; pattern classification; text analysis; Naive Bayes classification model; auxiliary labeled text data; data mining; image classification; image feature distribution; multimedia data; text-aided image classification; Australia; Computer displays; Engines; Image classification; Petroleum;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Web Conference (APWEB), 2010 12th International Asia-Pacific
Conference_Location :
Busan
Print_ISBN :
978-1-7695-4012-2
Electronic_ISBN :
978-1-4244-6600-9
Type :
conf
DOI :
10.1109/APWeb.2010.49
Filename :
5474126