• DocumentCode
    2641045
  • Title
    A Approach for Text Classification Feature Dimensionality Reduction and Rule Generation on Rough Set
  • Author
    Yin, Shiqun ; Huang, ZhiXing ; Chen, Lu ; Qiu, Yuhui
  • Author_Institution
    Fac. of Comput. & Inf. Sci., Southwest Univ., Chongqing
  • fYear
    2008
  • fDate
    18-20 June 2008
  • Firstpage
    554
  • Lastpage
    554
  • Abstract
    High-dimensional data are frequently encountered in Web text classification. Mining such data is extraordinarily difficult because of the curse of dimensionality, so feature dimensionality reduction is needed. This paper presents an attribute reduction algorithm based on rough set theory that reduces the number of text feature terms and extracts classification rules. First, the weights of the feature terms are discretized. Then, a decision table is built with the discretized weights as the condition attributes and the text classes as the decision attributes. Finally, the classification rules are extracted by attribute reduction. The method is simple and feasible. It improves the efficiency of the selected feature subset and is suitable for high-volume text classification. The extracted rules are easy to understand. Accuracy is higher and classification is faster than with classification based on vector space comparison. This paper describes the proposed technique and provides experimental results.
  • Keywords
    data mining; rough set theory; text analysis; Web text classification; attribute reduction algorithm; classification rules; decision attributes; feature dimensionality reduction; high dimensional data mining; rule extraction; rule generation; vector space comparison; Data mining; Feature extraction; Information filtering; Information retrieval; Information science; Internet; Search engines; Set theory; Space technology; Text categorization
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Innovative Computing Information and Control, 2008. ICICIC '08. 3rd International Conference on
  • Conference_Location
    Dalian, Liaoning
  • Print_ISBN
    978-0-7695-3161-8
  • Electronic_ISBN
    978-0-7695-3161-8
  • Type
    conf
  • DOI
    10.1109/ICICIC.2008.7
  • Filename
    4603742
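
The three steps named in the abstract (discretize term weights, build a decision table, reduce attributes and read off rules) can be illustrated with a minimal Python sketch. This is not the authors' algorithm, which the record does not specify: the cut points, the toy documents and labels, and all function names are assumptions made for illustration, and the reduct search is a brute-force enumeration that is only practical for very small term sets.

```python
from itertools import combinations

def discretize(weight, cuts=(0.0, 0.5)):
    """Map a term weight to a level: the number of cut points it exceeds (assumed cuts)."""
    return sum(weight > c for c in cuts)

def build_decision_table(weighted_docs, labels, terms):
    """Each row is (condition tuple of discretized weights, decision class)."""
    return [(tuple(discretize(doc.get(t, 0.0)) for t in terms), label)
            for doc, label in zip(weighted_docs, labels)]

def is_consistent(table, attrs):
    """True if rows that agree on `attrs` always share the same decision."""
    seen = {}
    for cond, decision in table:
        key = tuple(cond[i] for i in attrs)
        if seen.setdefault(key, decision) != decision:
            return False
    return True

def find_reduct(table, n_attrs):
    """Smallest attribute subset that keeps the table consistent (exhaustive search)."""
    for k in range(n_attrs + 1):
        for attrs in combinations(range(n_attrs), k):
            if is_consistent(table, attrs):
                return attrs
    return None  # the table is inconsistent even with all attributes

def extract_rules(table, attrs, terms):
    """One rule per distinct condition pattern over the reduct attributes."""
    return {tuple((terms[i], cond[i]) for i in attrs): decision
            for cond, decision in table}

# Hypothetical toy corpus: term weights per document and a class label each.
terms = ["goal", "match", "stock", "market"]
docs = [
    {"goal": 0.8, "match": 0.6},
    {"goal": 0.7, "market": 0.1},
    {"stock": 0.9, "market": 0.7},
    {"stock": 0.6, "match": 0.2},
]
labels = ["sports", "sports", "finance", "finance"]

table = build_decision_table(docs, labels, terms)
reduct = find_reduct(table, len(terms))
rules = extract_rules(table, reduct, terms)
for condition, decision in rules.items():
    print(condition, "->", decision)
```

On this toy table a single term is enough to separate the two classes, so the four condition attributes shrink to one and each rule has a single condition. A practical system would replace the exhaustive search with a heuristic reduct computation, since the number of candidate subsets grows exponentially with the number of terms.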