• DocumentCode
    1441933
  • Title

    Long-Term Incremental Web-Supervised Learning of Visual Concepts via Random Savannas

  • Author

    Ewerth, Ralph ; Ballafkir, Khalid ; Mühling, Markus ; Seiler, Dominik ; Freisleben, Bernd

  • Author_Institution
    Dept. of Math. & Comput. Sci., Univ. of Marburg, Marburg, Germany
  • Volume
    14
  • Issue
    4
  • fYear
    2012
  • Firstpage
    1008
  • Lastpage
    1020
  • Abstract
    The idea of using image and video data available on the World Wide Web (WWW) as training data for classifier construction has received some attention in the past few years. In this paper, we present a novel incremental and scalable web-supervised learning system that continuously learns appearance models for image categories with heterogeneous appearances and improves these models periodically. The system is initialized simply by specifying the name of the concept to be learned; no further supervision is required afterwards. Textual and visual information on web sites is used to filter out irrelevant and misleading training images. To obtain a robust, flexible, and updatable way of learning, a novel learning framework is presented that relies on clustering to identify visual subclasses before using an ensemble of random forests, called a random savanna, for subclass learning. Experimental results demonstrate that the proposed web-supervised learning approach outperforms a support vector machine (SVM), while being straightforward to parallelize in both the training and testing phases.
  • Keywords
    Web sites; image classification; information filtering; learning (artificial intelligence); pattern clustering; random processes; text analysis; video retrieval; WWW; World Wide Web; appearance models; classifier construction; clustering; heterogeneous appearances; image categories; image data; irrelevant training image filter; long-term incremental Web-supervised learning; misleading training image filter; random forest ensembles; random savannas; scalable Web-supervised learning system; subclass learning; textual information; training data; video data; visual concepts; visual information; visual subclass identification; Google; Semantics; Support vector machines; Training; Visualization; incremental learning; random forest; random savanna; web-supervised learning
  • fLanguage
    English
  • Journal_Title
    Multimedia, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1520-9210
  • Type

    jour

  • DOI
    10.1109/TMM.2012.2186956
  • Filename
    6146437