• DocumentCode
    2068590
  • Title

    A new feature selection method based on distributional information for Text Classification

  • Author

    Shi, Nianyun ; Liu, Lingling

  • Author_Institution
    Coll. of Comput. & Commun. Eng., China Univ. of Pet. (East China), Dongying, China
  • Volume
    1
  • fYear
    2010
  • fDate
    10-12 Dec. 2010
  • Firstpage
    190
  • Lastpage
    194
  • Abstract
    Feature Selection (FS) is one of the most important issues in Text Classification (TC). Good feature selection can improve the efficiency and accuracy of a text classifier. Based on an analysis of features' distributional information, this paper presents a feature selection method named DIFS. DIFS proposes a new estimation mechanism to measure the relevance between a feature's distribution characteristics and its contribution to categorization. In addition, two algorithms are designed to implement DIFS. Experiments are carried out on a Chinese corpus, and in comparison the proposed approach shows better performance.
  • Keywords
    classification; estimation theory; natural language processing; text analysis; Chinese corpus; DIFS; distributional information; estimation mechanism; feature selection method; text classification; text classifier; Estimation; Text categorization; Distributional Information; Feature Selection (FS); Text Classification (TC)
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Progress in Informatics and Computing (PIC), 2010 IEEE International Conference on
  • Conference_Location
    Shanghai
  • Print_ISBN
    978-1-4244-6788-4
  • Type

    conf

  • DOI
    10.1109/PIC.2010.5687404
  • Filename
    5687404