• DocumentCode
    438755
  • Title

    Pruning training sets for learning of object categories

  • Author

    Angelova, Anelia ; Abu-Mostafa, Yaser ; Perona, Pietro

  • Author_Institution
    Dept. of Comput. Sci., California Inst. of Technol., Pasadena, CA, USA
  • Volume
    1
  • fYear
    2005
  • fDate
    20-25 June 2005
  • Firstpage
    494
  • Abstract
    Training datasets for learning of object categories are often contaminated or imperfect. We explore an approach to automatically identify examples that are noisy or troublesome for learning and exclude them from the training set. The problem is relevant to learning in semi-supervised or unsupervised settings, as well as to learning when the training data is contaminated with wrongly labeled examples or when correctly labeled but hard-to-learn examples are present. We propose a fully automatic mechanism for noise cleaning, called 'data pruning', and demonstrate its success on learning of human faces. It is not assumed that the data or the noise can be modeled or that additional training examples are available. Our experiments show that data pruning can improve generalization performance for algorithms with varying robustness to noise. It outperforms methods with regularization properties and is superior to commonly applied aggregation methods, such as bagging.
  • Keywords
    face recognition; learning (artificial intelligence); noise; object detection; aggregation method; data pruning; noise cleaning; object category learning; training set pruning; Application software; Cleaning; Computer science; Computer vision; Data mining; Face; Humans; Labeling; Noise robustness; Training data
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Vision and Pattern Recognition, 2005. CVPR 2005. IEEE Computer Society Conference on
  • ISSN
    1063-6919
  • Print_ISBN
    0-7695-2372-2
  • Type

    conf

  • DOI
    10.1109/CVPR.2005.283
  • Filename
    1467308