• DocumentCode
    2850345
  • Title
    Aligning boundary in kernel space for learning imbalanced dataset
  • Author
    Wu, Gang ; Chang, Edward Y.
  • Author_Institution
    Dept. of Electr. & Comput. Eng., California Univ., Santa Barbara, CA, USA
  • fYear
    2004
  • fDate
    1-4 Nov. 2004
  • Firstpage
    265
  • Lastpage
    272
  • Abstract
    An imbalanced training dataset poses a serious problem for many real-world supervised learning tasks. In this paper, we propose a kernel-boundary-alignment algorithm, which treats training-data imbalance as prior information and uses it to augment SVMs and improve class-prediction accuracy. Using a simple example, we first show that SVMs can suffer from a high incidence of false negatives when the training instances of the target class are heavily outnumbered by the training instances of a nontarget class. The remedy we propose is to adjust the class boundary by modifying the kernel matrix according to the imbalanced data distribution. Through theoretical analysis backed by empirical study, we show that our kernel-boundary-alignment algorithm works effectively on several datasets.
  • Keywords
    learning (artificial intelligence); matrix algebra; support vector machines; SVM; class prediction; imbalanced dataset learning; kernel boundary alignment; kernel matrix; supervised learning; training-data imbalance; Algorithm design and analysis; Bayesian methods; Data engineering; Kernel; Machine learning algorithms; Supervised learning; Support vector machine classification; Support vector machines; Surveillance; Training data
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Data Mining, 2004. ICDM '04. Fourth IEEE International Conference on
  • Print_ISBN
    0-7695-2142-8
  • Type
    conf
  • DOI
    10.1109/ICDM.2004.10106
  • Filename
    1410293