DocumentCode
533636
Title
An Adaptive Ontology Based Hierarchical Browsing System for CiteSeerx
Author
Ye, Nanhong ; Gauch, Susan ; Wang, Qiang ; Luong, Hiep
Author_Institution
CSCE Dept., Univ. of Arkansas, Fayetteville, AR, USA
fYear
2010
fDate
7-9 Oct. 2010
Firstpage
203
Lastpage
208
Abstract
As an indispensable technique in the field of Information Retrieval, the ontology-based retrieval system (or browsing hierarchy) has been well studied and developed in both academia and industry. However, most current systems suffer from the following problems: (1) Constructing the mappings between documents and concepts in an ontology requires training robust hierarchical classifiers; it is difficult to build such classifiers for large-scale document corpora because of time-efficiency and precision issues. (2) Traditional hierarchical browsing systems ignore the distribution of documents over concepts, which is unrealistic when a large number of documents are concentrated on a few concepts. Browsing the documents under such concepts becomes time-consuming and impractical for users. Therefore, further splitting these concepts into sub-categories is necessary and critical for organizing documents in the browsing system. Aiming to build the hierarchical browsing system more realistically and accurately, we propose in this paper an adaptive hierarchical browsing system framework, designed to build a browsing hierarchy for CiteSeerx. In this framework, we first investigate supervised learning approaches for classifying documents into the existing predefined concepts of the ontology and compare their performance on different CiteSeerx datasets. Then, we give an empirical analysis of unsupervised learning methods for adding new clusters to the existing browsing hierarchy. Experimental analysis on the CiteSeerx corpus shows the effectiveness and efficiency of our method.
Keywords
classification; document handling; information retrieval; learning (artificial intelligence); ontologies (artificial intelligence); CiteSeer; adaptive ontology; documents distribution; hierarchical browsing system; hierarchical classifiers; information retrieval; large-scale documents; ontology based retrieval system; supervised learning; unsupervised learning; Buildings; Classification algorithms; Clustering algorithms; Entropy; Indexes; Ontologies; Partitioning algorithms; Browsing System; Ontology; Unsupervised Learning;
fLanguage
English
Publisher
IEEE
Conference_Titel
Knowledge and Systems Engineering (KSE), 2010 Second International Conference on
Conference_Location
Hanoi
Print_ISBN
978-1-4244-8334-1
Type
conf
DOI
10.1109/KSE.2010.32
Filename
5632004