DocumentCode :
3286022
Title :
Hierarchical Document Classification Based on a Backtracking Algorithm
Author :
Zhu, Cuiling ; Ma, Jun ; Zhang, DongMei ; Han, Xiaohui ; Niu, Xiaofei
Author_Institution :
Sch. of Comput. Sci. & Technol., Shandong Univ., Jinan
Volume :
2
fYear :
2008
fDate :
18-20 Oct. 2008
Firstpage :
467
Lastpage :
471
Abstract :
Hierarchical document classification assigns one or more suitable categories from a hierarchical category space to a document. This paper proposes a new hierarchical document classification method based on a backtracking algorithm. Using the relationships between categories in the category tree, the method finds a suitable threshold for every category to determine whether a document can be classified into that category. The backtracking algorithm in this approach effectively solves the problem that a misclassification at a higher level leads directly to a misclassification at a lower level. Moreover, the feature set is selected by integrating information gain with hierarchy information, which accords with the characteristics of a category tree. Experiments show that the method performs well when enough training documents are given.
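The abstract does not give the paper's actual procedure, so the following is only a minimal sketch of how a threshold-gated, top-down classifier with backtracking might look. Everything in it is an assumption made for illustration: the `Category` class, the `classify` function, the toy category tree, the score values, and the simplification to a single returned path (the paper allows one or more categories). The per-category scorer is a hypothetical callable, not the paper's classifier.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Category:
    """A node of the category tree with its own acceptance threshold (hypothetical)."""
    name: str
    threshold: float = 0.0
    children: List["Category"] = field(default_factory=list)


def classify(node: Category, score: Callable[[str], float]) -> Optional[List[str]]:
    """Return a path of category names from below `node` down to an accepted
    leaf, or None if no branch under `node` accepts the document."""
    if not node.children:
        return []  # reached a leaf: nothing further to choose
    # Try the most promising child first, as a greedy top-down classifier would.
    for child in sorted(node.children, key=lambda c: score(c.name), reverse=True):
        if score(child.name) < child.threshold:
            continue  # document does not meet this category's threshold
        sub_path = classify(child, score)
        if sub_path is not None:
            return [child.name] + sub_path
        # Dead end below `child`: backtrack and try the next sibling.
    return None


# Toy tree and scores, invented for this example only.
root = Category("root", children=[
    Category("science", 0.5, [Category("physics", 0.6), Category("biology", 0.6)]),
    Category("sports", 0.4, [Category("football", 0.5)]),
])
scores = {"science": 0.7, "physics": 0.3, "biology": 0.2,
          "sports": 0.6, "football": 0.8}

result = classify(root, lambda name: scores[name])
print(result)
```

In this toy run the top-level choice "science" scores highest, but neither of its children meets its threshold, so the search returns to the top level and accepts "sports" and then "football". A purely greedy top-down classifier would have been stuck with the wrong top-level decision.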
Keywords :
backtracking; classification; document handling; feature extraction; tree searching; backtracking algorithm; category tree; feature set selection; hierarchical document classification method; Classification tree analysis; Computer science; Conference management; Fuzzy systems; Information management; Information retrieval; Knowledge management; Machine learning; Space technology; Technology management; Backtracking Algorithm; Hierarchical Document Classification; information gain;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Fuzzy Systems and Knowledge Discovery, 2008. FSKD '08. Fifth International Conference on
Conference_Location :
Shandong
Print_ISBN :
978-0-7695-3305-6
Type :
conf
DOI :
10.1109/FSKD.2008.346
Filename :
4666161