• DocumentCode
    2031792
  • Title
    Automatic filtering algorithm for imbalanced classification
  • Author
    Gong, Wei ; Zhou, Youjie ; Luo, Hangzai ; Fan, Jianping ; Zhou, Aoying
  • Author_Institution
    Massive Comput. Inst., East China Normal Univ., Shanghai, China
  • Volume
    4
  • fYear
    2010
  • fDate
    10-12 Aug. 2010
  • Firstpage
    1853
  • Lastpage
    1857
  • Abstract
    Imbalanced data sets have been reported to hinder the classification performance of many machine learning algorithms in both accuracy and speed. Yet extremely imbalanced data sets (3-5% positive samples) are common in many applications, such as multimedia semantic classification. In this paper, we propose a novel algorithm that automatically removes samples that have no effect or a negative effect on classifier training from imbalanced training data sets. With our algorithm, most easy-to-classify dominant-class samples in an imbalanced training set are eliminated automatically. As a result, the ratio of minority-class samples increases significantly, making the training set more suitable for classification algorithms. Experiments show that our algorithm preserves the classification accuracy of SVM while dramatically decreasing the training time.
  • Keywords
    information filtering; learning (artificial intelligence); pattern classification; support vector machines; SVM; automatic filtering algorithm; classifier training; imbalanced data classification; machine learning algorithms; training data sets; Accuracy; Algorithm design and analysis; Feature extraction; Machine learning algorithms; Support vector machines; Training; Training data
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Fuzzy Systems and Knowledge Discovery (FSKD), 2010 Seventh International Conference on
  • Conference_Location
    Yantai, Shandong
  • Print_ISBN
    978-1-4244-5931-5
  • Type
    conf
  • DOI
    10.1109/FSKD.2010.5569437
  • Filename
    5569437