• DocumentCode
    2252124
  • Title

    Sample selection based on maximum entropy for support vector machines

  • Author

    Wang, Ran ; Kwong, Sam

  • Author_Institution
    Dept. of Comput. Sci., City Univ. of Hong Kong, Kowloon, China
  • Volume
    3
  • fYear
    2010
  • fDate
    11-14 July 2010
  • Firstpage
    1390
  • Lastpage
    1395
  • Abstract
    In classification problems, unlabeled data is typically abundant while labeling data is expensive. In addition, large data sets often contain redundancy, which degrades classifier performance. To guarantee the generalization capability of the classifiers, a certain number of suitable unlabeled samples need to be selected and labeled. This process is referred to as sample selection. In this paper, we propose an active learning model of sample selection for support vector machines based on the measurement of neighborhood entropy. To evaluate the capability of the generated SVMs, experiments have been conducted on several benchmark data sets. Comparisons between our proposed method and the random selection method have also been conducted.
  • Keywords
    entropy; learning (artificial intelligence); pattern classification; redundancy; support vector machines; active learning model; classification problems; data labeling; large data sets; maximum entropy; neighborhood entropy measurement; random selecting method; redundancy; sample selection; support vector machines; Classification algorithms; Complexity theory; Entropy; Machine learning; Support vector machines; Training; Uncertainty; Active learning; Neighborhood entropy; SVM; Sample selection;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Machine Learning and Cybernetics (ICMLC), 2010 International Conference on
  • Conference_Location
    Qingdao
  • Print_ISBN
    978-1-4244-6526-2
  • Type

    conf

  • DOI
    10.1109/ICMLC.2010.5580848
  • Filename
    5580848