DocumentCode :
533636
Title :
An Adaptive Ontology Based Hierarchical Browsing System for CiteSeerx
Author :
Ye, Nanhong ; Gauch, Susan ; Wang, Qiang ; Luong, Hiep
Author_Institution :
CSCE Dept., Univ. of Arkansas, Fayetteville, AR, USA
fYear :
2010
fDate :
7-9 Oct. 2010
Firstpage :
203
Lastpage :
208
Abstract :
Ontology-based retrieval systems (or browsing hierarchies) are an indispensable complement to Information Retrieval techniques and have been well studied and developed in both academia and industry. However, most current systems suffer from the following problems: (1) Constructing the mappings between documents and ontology concepts requires training robust hierarchical classifiers; it is difficult to build such classifiers for large-scale document corpora because of time-efficiency and precision issues. (2) Traditional hierarchical browsing systems ignore the distribution of documents over concepts, which is unrealistic when a large number of documents are concentrated on a few concepts. Browsing the documents under such concepts becomes time-consuming and impractical for users. Therefore, further splitting these concepts into sub-categories is necessary and critical for organizing documents in the browsing system. Aiming to build a more realistic and accurate hierarchical browsing system, we propose in this paper an adaptive Hierarchical Browsing System framework, designed to build a browsing hierarchy for CiteSeerx. In this framework, we first investigate supervised learning approaches for classifying documents into the existing predefined concepts of the ontology and compare their performance on different CiteSeerx datasets. Then, we give an empirical analysis of unsupervised learning methods for adding new clusters to the existing browsing hierarchy. Experimental analysis on the CiteSeerx corpus shows the effectiveness and efficiency of our method.
Keywords :
classification; document handling; information retrieval; learning (artificial intelligence); ontologies (artificial intelligence); CiteSeer; adaptive ontology; documents distribution; hierarchical browsing system; hierarchical classifiers; information retrieval; large-scale documents; ontology based retrieval system; supervised learning; unsupervised learning; Buildings; Classification algorithms; Clustering algorithms; Entropy; Indexes; Ontologies; Partitioning algorithms; Browsing System; Ontology; Unsupervised Learning;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Knowledge and Systems Engineering (KSE), 2010 Second International Conference on
Conference_Location :
Hanoi
Print_ISBN :
978-1-4244-8334-1
Type :
conf
DOI :
10.1109/KSE.2010.32
Filename :
5632004