• DocumentCode
    2232851
  • Title

    An approach of multi-hierarchy text classification

  • Author

    Liu, Shaohui ; Dong, Mingkai ; Zhang, Haijun ; Li, Rong ; Shi, Zhongzhi

  • Author_Institution
    Inst. of Comput. Technol., Chinese Acad. of Sci., Beijing, China
  • Volume
    3
  • fYear
    2001
  • fDate
    2001
  • Firstpage
    95
  • Abstract
    This paper improves on the classical formula for calculating term weight in the vector space model. Furthermore, an approach to multi-hierarchy text classification based on the vector space model is proposed. In this approach, all classes are organized as a tree according to given hierarchical relations, and all the training documents in a class are combined into a single class-document. To construct the class models, only the class-documents attached to the same node of the same layer are compared. When a document is classified, a matching process is performed hierarchically from the root node toward the leaf nodes until a corresponding subclass is found. Experiments and real systems indicate that the approach achieves high classification precision and recall.
  • Keywords
    entropy; probability; text analysis; class-document; feature selection; hierarchical relations; information gain; leaf nodes; matching process; multi-hierarchy text classification; root node; training documents; vector space model; Computers; Frequency; Information processing; Information technology; Internet; Laboratories; Space technology; Support vector machine classification; Support vector machines; Text categorization
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Info-tech and Info-net, 2001. Proceedings. ICII 2001 - Beijing. 2001 International Conferences on
  • Conference_Location
    Beijing
  • Print_ISBN
    0-7803-7010-4
  • Type

    conf

  • DOI
    10.1109/ICII.2001.983042
  • Filename
    983042
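
The procedure outlined in the abstract (class-documents, comparison among sibling classes only, top-down matching) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give their improved term-weight formula, so a standard smoothed TF-IDF weight is swapped in, and cosine similarity is assumed as the matching measure. The names `Node`, `train`, `cosine`, and `classify`, as well as the toy class tree, are hypothetical.

```python
import math
from collections import Counter


class Node:
    """A class in the hierarchy; leaves hold training documents (hypothetical type)."""

    def __init__(self, name, children=None, docs=None):
        self.name = name
        self.children = children or []
        self.docs = docs or []       # raw training documents for this class
        self.child_vectors = {}      # child name -> weighted term vector

    def class_document(self):
        """Merge every training document under this node into one token list."""
        tokens = [t for d in self.docs for t in d.lower().split()]
        for child in self.children:
            tokens.extend(child.class_document())
        return tokens


def train(node):
    """Build class models by comparing only the class-documents of siblings."""
    if not node.children:
        return
    counts = {c.name: Counter(c.class_document()) for c in node.children}
    n = len(counts)
    # Document frequency counted over sibling class-documents only.
    df = Counter(t for tf in counts.values() for t in tf)
    for name, tf in counts.items():
        # Assumption: smoothed TF-IDF stands in for the paper's weight formula.
        node.child_vectors[name] = {
            t: f * (math.log((1 + n) / (1 + df[t])) + 1) for t, f in tf.items()
        }
    for child in node.children:
        train(child)


def cosine(a, b):
    """Cosine similarity between two sparse term vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def classify(root, text):
    """Descend from the root, picking the best-matching child at each layer."""
    query = Counter(text.lower().split())
    node, path = root, []
    while node.children:
        vectors = node.child_vectors
        node = max(node.children, key=lambda c: cosine(query, vectors[c.name]))
        path.append(node.name)
    return path


# Toy two-layer class tree with made-up training documents.
root = Node("root", children=[
    Node("sports", children=[
        Node("football", docs=["goal striker penalty match", "league goal keeper"]),
        Node("tennis", docs=["serve racket set match", "court serve volley"]),
    ]),
    Node("computing", children=[
        Node("hardware", docs=["cpu memory chip board", "disk cpu cache"]),
        Node("software", docs=["compiler code bug release", "code library api"]),
    ]),
])
train(root)
print(classify(root, "the striker scored a goal"))
```

Because each node compares only its own children, a document is scored against a handful of sibling models per layer rather than against every leaf class at once. This sketch always descends to a leaf; a similarity threshold for stopping early would be a further assumption.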