• DocumentCode
    860697
  • Title

    Bayes Vector Quantizer for Class-Imbalance Problem

  • Author

    Diamantini, Claudia ; Potena, Domenico

  • Author_Institution
    Dipt. di Ing. Inf., Gestionale e dell'Autom., Università Politec. delle Marche, Ancona
  • Volume
    21
  • Issue
    5
  • fYear
    2009
  • fDate
    5/1/2009 12:00:00 AM
  • Firstpage
    638
  • Lastpage
    651
  • Abstract
    The class-imbalance problem is the problem of learning a classification rule from data that are skewed in favor of one class. On such datasets, traditional learning techniques tend to overlook the less numerous class, to the advantage of the majority class. However, the minority class is often the most interesting one for the task at hand. For this reason, the class-imbalance problem has received increasing attention in the last few years. In the present paper, we draw the reader's attention to a learning algorithm for the minimization of the average misclassification risk. In contrast to some popular class-imbalance learning methods, this method has its roots in statistical decision theory. A particularly interesting characteristic is that, when class distributions are unknown, the method can work by resorting to a stochastic gradient algorithm. We study the behavior of this algorithm on imbalanced datasets, demonstrating that this principled approach yields better classification performance than the principal methods proposed in the literature.
  • Keywords
    Bayes methods; decision theory; gradient methods; learning (artificial intelligence); minimisation; vector quantisation; Bayes vector quantizer; class-imbalance problem; classification rule; learning; minimization; misclassification risk; statistical decision theory; stochastic gradient algorithm; Classifier design and evaluation; Clustering, classification, and association rules; Data mining; Machine learning; Mining methods and algorithms
  • fLanguage
    English
  • Journal_Title
    Knowledge and Data Engineering, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1041-4347
  • Type

    jour

  • DOI
    10.1109/TKDE.2008.187
  • Filename
    4624261
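
The average misclassification risk named in the abstract can be illustrated with a minimal sketch. This is not the paper's Bayes vector quantizer; it only evaluates the standard decision-theoretic quantity (the cost-weighted sum over true/decided class pairs) for two fixed classifiers. The confusion counts, the cost values, and the function name `average_risk` are assumptions made up for illustration.

```python
def average_risk(confusion, cost):
    """Empirical average misclassification risk.

    confusion[i][j]: number of samples of true class i assigned to class j.
    cost[i][j]: cost of deciding class j when the true class is i.
    """
    total = sum(sum(row) for row in confusion)
    weighted = sum(
        confusion[i][j] * cost[i][j]
        for i in range(len(confusion))
        for j in range(len(confusion[i]))
    )
    return weighted / total


# Hypothetical imbalanced test set: 950 majority (class 0), 50 minority (class 1).
always_majority = [[950, 0],   # every sample assigned to class 0
                   [50, 0]]
balanced_rule = [[900, 50],    # gives up some majority accuracy
                 [10, 40]]     # to recover most minority samples

zero_one = [[0, 1],
            [1, 0]]            # plain error rate
asymmetric = [[0, 1],
              [10, 0]]         # assumed: missing a minority sample costs 10x

for name, cm in [("always_majority", always_majority),
                 ("balanced_rule", balanced_rule)]:
    print(name, average_risk(cm, zero_one), average_risk(cm, asymmetric))
```

With these made-up numbers, the trivial majority-only rule has the lower error rate (0.05 against 0.06) but the higher risk once minority errors are weighted more heavily (0.5 against 0.15), which is the kind of effect that motivates minimizing risk rather than error on skewed data.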