• DocumentCode
    2840644
  • Title

    A mutual information and information entropy pair based feature selection method in text classification

  • Author

    Pei, Zhili ; Zhou, Yuxin ; Liu, Lisha ; Wang, Lihua ; Lu, Yinan ; Kong, Ying

  • Author_Institution
    Coll. of Comput. Sci. & Technol., Inner Mongolia Univ. for the Nat., Tongliao, China
  • Volume
    6
  • fYear
    2010
  • fDate
    22-24 Oct. 2010
  • Abstract
    Text classification is an important research area in data mining. This article proposes a feature selection method based on mutual information and information entropy pairs (MIIEP_FS), built on information entropy theory and the concept of an information entropy pair. The method uses mutual information to measure how much a feature contributes to classification, and uses information entropy to show the extent of the difference between the features being selected and those already selected. Experimental results show that the proposed MIIEP_FS method is more effective than the MI and CHI methods. With MIIEP_FS, the macro F1 scores of the Naive Bayes and KNN classifiers are higher, and sometimes even exceed those of support vector machines.
  • Keywords
    data mining; learning (artificial intelligence); pattern classification; support vector machines; text analysis; MIIEP_FS; data mining topics; feature selection method; machine learning algorithms; mutual information and information entropy pair based feature selection method; text classification; feature selection; information entropy
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Computer Application and System Modeling (ICCASM), 2010 International Conference on
  • Conference_Location
    Taiyuan
  • Print_ISBN
    978-1-4244-7235-2
  • Electronic_ISBN
    978-1-4244-7237-6
  • Type

    conf

  • DOI
    10.1109/ICCASM.2010.5620805
  • Filename
    5620805