• DocumentCode
    2866920
  • Title
    An Improved χ2 (CHI) Statistics Method for Text Feature Selection
  • Author
    Yan, Tang; Ting, Xiao
  • Author_Institution
    Coll. of Comput. & Inf. Sci., Southwest Univ., Chongqing, China
  • fYear
    2009
  • fDate
    11-13 Dec. 2009
  • Firstpage
    1
  • Lastpage
    4
  • Abstract
    Feature selection is an active topic in current research, especially in the field of text categorization. To overcome the shortcomings of the traditional χ2 (CHI) approach, this paper proposes an improved χ2 (CHI) statistics method. It combines criteria from traditional statistical methods, such as Document Frequency and Class Accuracy, to improve the χ2 (CHI) statistical method. The experimental results show that the proposed method is more effective than the traditional χ2 (CHI) method.
  • Keywords
    data mining; statistical analysis; χ2 CHI statistics method; class accuracy criterion; document frequency criterion; feature selection; text categorization; Data mining; Educational institutions; Entropy; Frequency; Information science; Mutual information; Statistical analysis; Statistics; Text categorization; Text mining
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computational Intelligence and Software Engineering, 2009. CiSE 2009. International Conference on
  • Conference_Location
    Wuhan
  • Print_ISBN
    978-1-4244-4507-3
  • Electronic_ISBN
    978-1-4244-4507-3
  • Type
    conf
  • DOI
    10.1109/CISE.2009.5366401
  • Filename
    5366401