• DocumentCode
    76963
  • Title
    SEG-SSC: A Framework Based on Synthetic Examples Generation for Self-Labeled Semi-Supervised Classification
  • Author
    Triguero, Isaac; Garcia, Salvador; Herrera, Francisco
  • Author_Institution
    Dept. of Comput. Sci. & Artificial Intell., Univ. of Granada, Granada, Spain
  • Volume
    45
  • Issue
    4
  • fYear
    2015
  • fDate
    Apr-15
  • Firstpage
    622
  • Lastpage
    634
  • Abstract
    Self-labeled techniques are semi-supervised classification methods that address the shortage of labeled examples via a self-learning process based on supervised models. They progressively classify unlabeled data and use them to modify the hypothesis learned from labeled samples. The most relevant proposals are currently inspired by boosting schemes to iteratively enlarge the labeled set. Despite their effectiveness, these methods are constrained by the number of labeled examples and their distribution, which in many cases is sparse and scattered. The aim of this paper is to design a framework, named synthetic examples generation for self-labeled semi-supervised classification (SEG-SSC), to improve the classification performance of any given self-labeled method by using synthetic labeled data. These are generated via an oversampling technique and a positioning adjustment model that use both labeled and unlabeled examples as reference. Next, these examples are incorporated in the main stages of the self-labeling process. The principal aspects of the proposed framework are: 1) introducing diversity into the multiple classifiers by supplying additional (new) labeled data; 2) filling in the labeled data distribution with the aid of unlabeled data; and 3) being applicable to any kind of self-labeled method. In our empirical studies, we have applied this scheme to four recent self-labeled methods, testing their capabilities with a large number of data sets. We show that this framework significantly improves the classification capabilities of self-labeled techniques.
  • Keywords
    learning (artificial intelligence); pattern classification; sampling methods; SEG-SSC; classification performance; labeled data distribution; labeled samples; oversampling technique; positioning adjustment model; self-labeled method; self-labeled semisupervised classification; self-learning process; supervised models; synthetic examples generation; synthetic labeled data; Cybernetics; Manifolds; Prediction algorithms; Prototypes; Reliability; Standards; Training; Co-training; self-labeled methods; semi-supervised classification; synthetic examples
  • fLanguage
    English
  • Journal_Title
    Cybernetics, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    2168-2267
  • Type
    jour
  • DOI
    10.1109/TCYB.2014.2332003
  • Filename
    6847198