• DocumentCode
    1450830
  • Title

    New Semi-Supervised Classification Method Based on Modified Cluster Assumption

  • Author

    Yunyun Wang ; Songcan Chen ; Zhi-Hua Zhou

  • Author_Institution
    Dept. of Comput. Sci. & Eng., Nanjing Univ. of Aeronaut. & Astronaut., Nanjing, China
  • Volume
    23
  • Issue
    5
  • fYear
    2012
  • fDate
    5/1/2012 12:00:00 AM
  • Firstpage
    689
  • Lastpage
    702
  • Abstract
    The cluster assumption, which assumes that “similar instances should share the same label,” is a basic assumption in semi-supervised classification learning, and has been found very useful in many successful semi-supervised classification methods. It is rarely noticed that when the cluster assumption is adopted, there is an implicit assumption that every instance should have a crisp class label assignment. In real applications, however, there are cases where it is difficult to tell that an instance definitely belongs to one class and does not belong to other neighboring classes. In such cases, it is more adequate to assume that “similar instances should share similar label memberships” rather than sharing a crisp label assignment. Here “label memberships” can be represented as a vector, where each element corresponds to a class, and the value at the element expresses the likelihood of the concerned instance belonging to the class. By adopting this modified cluster assumption, in this paper we propose a new semi-supervised classification method, that is, semi-supervised classification based on class membership (SSCCM). Specifically, we try to solve the decision function and adequate label memberships for instances simultaneously, and constrain that an instance and its “local weighted mean” (LWM) share the same label membership vector, where the LWM is a robust image of the instance, constructed by calculating the weighted mean of its neighboring instances. We formulate the problem in a unified objective function for the labeled, unlabeled data and their LWMs based on the square loss function, and take an alternating iterative strategy to solve it, in which each step generates a closed-form solution, and the convergence is guaranteed. 
The solution will provide both the decision function and the label membership function for classification, their classification results can verify each other, and the reliability of semi-supervised classification learning might be enhanced by checking the consistency between those two predictions. Experiments show that SSCCM obtains encouraging results compared to state-of-the-art semi-supervised classification methods.
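
The abstract describes the "local weighted mean" (LWM) only as the weighted mean of an instance's neighboring instances. The following is a minimal sketch of that idea, not the authors' implementation: the choice of k nearest neighbors, the Gaussian weighting, and the names `local_weighted_mean`, `k`, and `bandwidth` are all assumptions introduced here for illustration.

```python
import math


def local_weighted_mean(points, index, k=3, bandwidth=1.0):
    """Weighted mean of the k nearest neighbors of points[index].

    Assumptions (not stated in the abstract): neighbors are the k nearest
    instances in Euclidean distance, and weights come from a Gaussian kernel.
    """
    target = points[index]

    # Squared Euclidean distance from the target to every other instance.
    dists = []
    for j, p in enumerate(points):
        if j == index:
            continue
        d2 = sum((a - b) ** 2 for a, b in zip(target, p))
        dists.append((d2, j))

    # Keep the k closest instances as the neighborhood.
    dists.sort()
    neighbors = dists[:k]

    # Gaussian kernel weights: closer neighbors contribute more.
    weights = [math.exp(-d2 / (2.0 * bandwidth ** 2)) for d2, _ in neighbors]
    total = sum(weights)
    if total == 0.0:
        # All weights underflowed; fall back to a plain (uniform) mean.
        weights = [1.0] * len(neighbors)
        total = float(len(neighbors))

    # Weighted average of the neighbors, one coordinate at a time.
    dim = len(target)
    return [
        sum(w * points[j][i] for w, (_, j) in zip(weights, neighbors)) / total
        for i in range(dim)
    ]


points = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [10.0, 10.0]]
lwm = local_weighted_mean(points, index=0, k=2)
```

In this toy data, the two nearest neighbors of the first point are equidistant, so they receive equal weight and the LWM is their midpoint; the distant fourth point is excluded from the neighborhood. In the method described above, an instance and its LWM would then be constrained to share the same label membership vector.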
  • Keywords
    iterative methods; learning (artificial intelligence); pattern classification; pattern clustering; LWM; SSCCM; cluster assumption; crisp class label assignment; decision function; implicit assumption; iterative strategy; label membership vector; local weighted mean; semisupervised classification based on class membership; semisupervised classification learning; square loss function; Clustering algorithms; Encoding; Kernel; Learning systems; Optimization; Reliability; Vectors; Cluster assumption; iteration; label membership function; local weighted mean; semi-supervised classification;
  • fLanguage
    English
  • Journal_Title
    Neural Networks and Learning Systems, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    2162-237X
  • Type

    jour

  • DOI
    10.1109/TNNLS.2012.2186825
  • Filename
    6153383