Author_Institution :
Coll. of Comput. Sci., Chongqing Univ., Chongqing, China
Abstract :
Most object classification models for images consider only positive and negative samples. This paper introduces a third type, the “gray sample”, which belongs to neither the positive nor the negative samples and carries knowledge from other domains or scenes. The degree of “gray” is defined by the semantic similarity between annotations and scenes. Local descriptors represent an object by its visual features, while annotations describe the object semantically. A class of objects has a single concept but different descriptions in different scenes; “gray samples” belong to the same concept but appear in other scenes in different manifestations. Only similar scenes share commonality, which serves as a bridge connecting knowledge across scenes. To achieve cross-scene learning and make use of “gray samples”, the degree of similarity between scenes is derived by computing the co-occurrence probability of annotations. Following the bags-of-features (BoF) approach, a large number of local descriptors are extracted from the annotated areas of images, visual words are obtained by clustering those descriptors, and the proposed model is built on the visual words. The model is constructed with the EM algorithm under different thresholds of the correlation degree and is then used to classify objects. The thresholds determine whether “gray samples” are suitable as training data and are a key factor in classification performance. In object classification experiments on the LabelMe dataset, the proposed model exhibits superior performance compared with other existing methods.
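The thresholding idea can be illustrated with a minimal Python sketch. The abstract does not give the exact similarity formula, so the sketch assumes a simple stand-in: each scene is summarized by the relative frequency of its annotation labels, and two scenes are compared by histogram intersection. The function names, the toy scenes, and the threshold value are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def label_distribution(images):
    """Relative frequency of each annotation label over a scene's images."""
    counts = Counter(label for labels in images for label in labels)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def scene_similarity(images_a, images_b):
    """Histogram intersection of two label distributions, in [0, 1].

    Assumed stand-in for the annotation co-occurrence measure.
    """
    p = label_distribution(images_a)
    q = label_distribution(images_b)
    return sum(min(p[label], q[label]) for label in p.keys() & q.keys())


def select_gray_scenes(target, others, threshold):
    """Names of scenes similar enough to the target to lend gray samples."""
    return sorted(
        name
        for name, images in others.items()
        if scene_similarity(target, images) >= threshold
    )


# Toy data: each scene is a list of images, each image a set of labels.
street = [{"car", "road", "building"}, {"car", "person", "road"}]
highway = [{"car", "road", "sky"}, {"car", "road"}]
kitchen = [{"table", "chair"}, {"sink", "table"}]

candidates = {"highway": highway, "kitchen": kitchen}
print(select_gray_scenes(street, candidates, threshold=0.5))
```

Under this assumed measure, raising the threshold admits fewer scenes as sources of gray samples and lowering it admits more, which is the trade-off the abstract attributes to the correlation threshold.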
Keywords :
expectation-maximisation algorithm; feature extraction; image classification; learning (artificial intelligence); pattern clustering; probability; EM algorithm; LabelMe dataset; annotation cooccurrence probability; bags-of-features approach; classification performance improvement; cross-scene learning; gray sample; local descriptor extraction; negative samples; object classification models; positive samples; semantic similarity; visual feature; visual words; Computational modeling; Educational institutions; Image segmentation; Object recognition; Semantics; Training; Visualization;