• DocumentCode
    475882
  • Title
    A Hybrid Text Classification Model based on Rough Sets and Genetic Algorithms
  • Author
    Wang, Xiaoyue ; Hua, Zhen ; Bai, Rujiang
  • Author_Institution
    Libr., Shandong Univ. of Technol., Zibo
  • fYear
    2008
  • fDate
    6-8 Aug. 2008
  • Firstpage
    971
  • Lastpage
    977
  • Abstract
    Automatic categorization of documents into pre-defined taxonomies is a crucial step in data mining and knowledge discovery. Standard machine learning techniques such as support vector machines (SVM) and related large-margin methods have been successfully applied to this task. Unfortunately, the high dimensionality of the input feature vectors reduces classification speed, while the kernel parameter settings chosen for the SVM during training affect classification accuracy. Feature selection is another factor that affects classification accuracy. The objective of this work is to reduce the dimension of the feature vectors and optimize the parameters so as to improve both the accuracy and the speed of SVM classification. To improve classification speed, we use rough set theory to reduce the feature vector space. To improve classification accuracy, we present a genetic algorithm approach for feature selection and parameter optimization. Experimental results indicate that our method is more effective than traditional SVM methods and other traditional methods.
  • Keywords
    data mining; pattern classification; rough set theory; support vector machines; text analysis; SVM classification accuracy; classification speed; data mining; documents automatic categorization; feature selection; genetic algorithms; hybrid text classification; knowledge discovery; machine learning techniques; rough sets theory; support vector machines; Artificial intelligence; Distributed computing; Genetic algorithms; Kernel; Organizing; Rough sets; Software engineering; Support vector machine classification; Support vector machines; Text categorization; Document Classification; Genetic Algorithms; Rough Sets; Support Vector Machine;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Software Engineering, Artificial Intelligence, Networking, and Parallel/Distributed Computing, 2008. SNPD '08. Ninth ACIS International Conference on
  • Conference_Location
    Phuket
  • Print_ISBN
    978-0-7695-3263-9
  • Type
    conf
  • DOI
    10.1109/SNPD.2008.142
  • Filename
    4617495