• DocumentCode
    3039307
  • Title

    Design of Chinese Text Categorization Classifier Based on Attribute Bagging

  • Author

    Zhang, Xiang ; Zhou, Mingquan ; Dong, Lili ; Ye, Na

  • Author_Institution
    Coll. of Inf. Sci. & Technol., Northwest Univ., Xi'an, China
  • fYear
    2009
  • fDate
    24-26 July 2009
  • Firstpage
    201
  • Lastpage
    204
  • Abstract
    To improve the precision and recall of a Chinese text classifier, this paper applies attribute bagging, an improved bagging algorithm. Documents are represented with the vector space model, and information gain is used for feature selection. Attributes are re-sampled to obtain multiple training sets, and kNN is selected as the individual classifier. The classification result is obtained by voting. Experiments show that attribute bagging achieves lower error and better performance than bagging and kNN in Chinese text categorization.
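    The pipeline in the abstract (re-sample attributes, train one kNN per attribute subset, combine by voting) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the squared Euclidean distance, sampling attributes without replacement, the parameter defaults, and the toy term-weight vectors are all assumptions. Information-gain feature selection is assumed to have been applied already.

```python
import random
from collections import Counter


def knn_predict(train_X, train_y, x, features, k):
    """Classify x by majority vote of its k nearest training documents,
    comparing only the given feature (attribute) subset."""
    def sq_dist(a, b):
        # Squared Euclidean distance restricted to the subset (assumed metric).
        return sum((a[f] - b[f]) ** 2 for f in features)

    nearest = sorted(range(len(train_X)), key=lambda i: sq_dist(train_X[i], x))[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


def attribute_bagging_predict(train_X, train_y, x,
                              n_classifiers=15, subset_size=2, k=3, seed=0):
    """Draw a random attribute subset per classifier, classify x with kNN
    on that subset, and return the label with the most votes."""
    rng = random.Random(seed)
    n_features = len(train_X[0])
    votes = Counter()
    for _ in range(n_classifiers):
        # Assumption: each subset is sampled without replacement.
        features = rng.sample(range(n_features), subset_size)
        votes[knn_predict(train_X, train_y, x, features, k)] += 1
    return votes.most_common(1)[0][0]


# Toy vector-space-model data: each row is a document, each column a term weight.
train_X = [
    [0.9, 0.8, 0.1, 0.0],
    [0.8, 0.9, 0.0, 0.1],
    [0.7, 0.9, 0.1, 0.1],
    [0.1, 0.0, 0.9, 0.8],
    [0.0, 0.1, 0.8, 0.9],
    [0.1, 0.1, 0.9, 0.7],
]
train_y = ["sports", "sports", "sports", "finance", "finance", "finance"]

print(attribute_bagging_predict(train_X, train_y, [0.85, 0.75, 0.05, 0.10]))  # sports
print(attribute_bagging_predict(train_X, train_y, [0.05, 0.10, 0.85, 0.80]))  # finance
```

    Ordinary bagging would instead re-sample the training documents; here every classifier sees all documents but only part of the attributes, which is the distinction the abstract draws.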
  • Keywords
    support vector machines; text analysis; Chinese text categorization classifier; attribute bagging algorithm; information gain; multiple training set; resampling attributes; support vector machine; vector space model; Algorithm design and analysis; Bagging; Boosting; Control engineering; Educational institutions; Frequency; Information science; Machine learning; Space technology; Text categorization; Chinese text categorization; attribute bagging
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Business Intelligence and Financial Engineering, 2009. BIFE '09. International Conference on
  • Conference_Location
    Beijing
  • Print_ISBN
    978-0-7695-3705-4
  • Type

    conf

  • DOI
    10.1109/BIFE.2009.55
  • Filename
    5208903