• DocumentCode
    1879213
  • Title
    A Novel Classifier-Independent Feature Selection Algorithm for Imbalanced Datasets
  • Author
    Zhu, Quanyin; Cao, Suqun
  • Author_Institution
    Dept. of Comput. Eng., Huaiyin Inst. of Technol., Huaiyin, China
  • fYear
    2009
  • fDate
    27-29 May 2009
  • Firstpage
    77
  • Lastpage
    82
  • Abstract
    A novel classifier-independent feature selection algorithm based on posterior probability is proposed for imbalanced datasets. First, an imbalanced factor is introduced and computed by Parzen-window estimation. The midpoint of each Tomek link is chosen as an initial point. The algorithm then iterates from these points to find the boundary points at which the posterior probabilities are equal. Through projection onto the normal vectors at these points, the weight of each feature is obtained, which indicates the importance of that feature. Experimental results on 3 real-world datasets demonstrate that the proposed algorithm not only reduces the computational cost but also overcomes a shortcoming of conventional feature selection algorithms, in which the majority class may be detected well while the minority class is ignored.
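The initialization step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard definition of a Tomek link (two samples from different classes that are each other's nearest neighbor) and Euclidean distance; the function name `tomek_link_midpoints` is hypothetical, and this is not the authors' implementation. The later steps (Parzen-window posterior estimation, boundary search, feature weighting) are not shown.

```python
import numpy as np

def tomek_link_midpoints(X, y):
    """Return Tomek-link index pairs and their midpoints (illustrative sketch).

    Assumes Euclidean distance and the usual Tomek-link definition:
    samples i and j have different labels and are mutual nearest neighbors.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)

    # Pairwise squared Euclidean distances; a sample is never its own neighbor.
    diff = X[:, None, :] - X[None, :, :]
    d2 = np.einsum("ijk,ijk->ij", diff, diff)
    np.fill_diagonal(d2, np.inf)

    # Index of each sample's nearest neighbor.
    nn = d2.argmin(axis=1)

    # Keep mutual nearest neighbors with different class labels (i < j avoids duplicates).
    pairs = [(i, int(j)) for i, j in enumerate(nn)
             if i < j and nn[j] == i and y[i] != y[j]]

    # Midpoint of each link; reshape keeps a (0, n_features) array when no links exist.
    mids = np.array([(X[i] + X[j]) / 2.0 for i, j in pairs]).reshape(-1, X.shape[1])
    return pairs, mids
```

On a toy dataset such as `X = [[0, 0], [1, 0], [5, 0], [5.5, 0]]` with labels `y = [0, 1, 1, 1]`, only the first two samples form a link, since the last two are mutual nearest neighbors but share a class.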
  • Keywords
    estimation theory; pattern classification; probability; Parzen-window estimation; Tomek link; classifier-independent feature selection; imbalanced dataset; posterior probability; Artificial intelligence; Computer networks; Concurrent computing; Data engineering; Distributed computing; Electronic mail; Intelligent networks; Mechanical engineering; Software algorithms; Software engineering; feature selection; imbalanced datasets
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Software Engineering, Artificial Intelligences, Networking and Parallel/Distributed Computing, 2009. SNPD '09. 10th ACIS International Conference on
  • Conference_Location
    Daegu
  • Print_ISBN
    978-0-7695-3642-2
  • Type
    conf
  • DOI
    10.1109/SNPD.2009.47
  • Filename
    5286691