DocumentCode :
2120391
Title :
Automatic Maintenance of the Category Hierarchy
Author :
Lei He ; Xiaoping Sun
Author_Institution :
Knowledge Grid Lab., Key Lab. of Intell. Inf. Process., Inst. of Comput. Technol., China
fYear :
2013
fDate :
3-4 Oct. 2013
Firstpage :
218
Lastpage :
221
Abstract :
Hierarchical models have become one of the most widely adopted and effective solutions for organizing large volumes of documents. Although general taxonomies exist on the Web, we observe that in most cases there are many inconsistencies between a general taxonomy and specific resources, because taxonomies are generated independently of the resources. Moreover, as new resources are added to the hierarchy, the internal patterns of categories change as new topics emerge or original topics merge. These problems result in a poor taxonomy, which ultimately hurts the user experience of searching for and locating resources. In this paper, we propose an effective method, AMHC (Automatic Maintenance of Hierarchical Classification), that modifies an existing taxonomy through a two-phase adjustment, from the global and the local perspective respectively. The first phase reorganizes the hierarchical taxonomy into more topically cohesive categories based on resource clustering, which speeds up the movement of categories between branches. The second phase performs localized modification with three operations (Pull-Up, Merge and Split) for better classification performance. We conduct experiments on the Reuters-21578 and 20Newsgroup datasets, report improved performance and provide detailed explanations.
Keywords :
classification; 20Newsgroup dataset; AMHC method; Reuters-21578 dataset; automatic maintenance-of-hierarchical classification method; category interbranch movements; classification performance improvement; document organization; global perspective; hierarchical taxonomy modification; internal patterns; local perspective; merge operation; pull-up operation; resource clustering; split operation; topically cohesive categories; two-phase adjustment; Accuracy; Computational modeling; Internet; Maintenance engineering; Semantics; Taxonomy; Training
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Semantics, Knowledge and Grids (SKG), 2013 Ninth International Conference on
Conference_Location :
Beijing
Type :
conf
DOI :
10.1109/SKG.2013.35
Filename :
6816611