• DocumentCode
    533636
  • Title

    An Adaptive Ontology Based Hierarchical Browsing System for CiteSeerx

  • Author

    Ye, Nanhong ; Gauch, Susan ; Wang, Qiang ; Luong, Hiep

  • Author_Institution
    CSCE Dept., Univ. of Arkansas, Fayetteville, AR, USA
  • fYear
    2010
  • fDate
    7-9 Oct. 2010
  • Firstpage
    203
  • Lastpage
    208
  • Abstract
    As an indispensable technique in the field of Information Retrieval, the Ontology-based Retrieval System (or Browsing Hierarchy) has been well studied and developed in both academia and industry. However, most current systems suffer from the following problems: (1) Constructing the mappings between documents and concepts in an ontology requires training robust hierarchical classifiers; it is difficult to build such classifiers for a large-scale document corpus because of time-efficiency and precision issues. (2) The traditional Hierarchical Browsing System ignores the distribution of documents over concepts, which is unrealistic when a large number of documents are concentrated on a few concepts. Browsing the documents under such concepts becomes time-consuming and impractical for users. Therefore, further splitting these concepts into sub-categories is necessary and critical for organizing documents in the browsing system. Aiming to build the Hierarchical Browsing System more realistically and accurately, we propose an adaptive Hierarchical Browsing System framework in this paper, designed to build a Browsing Hierarchy for CiteSeerx. In this framework, we first investigate supervised learning approaches for classifying documents into the existing predefined concepts of the ontology and compare their performance on different datasets from CiteSeerx. Then, we give an empirical analysis of unsupervised learning methods for adding new clusters to the existing browsing hierarchy. Experimental analysis on the CiteSeerx corpus shows the effectiveness and efficiency of our method.
  • Keywords
    classification; document handling; information retrieval; learning (artificial intelligence); ontologies (artificial intelligence); CiteSeer; adaptive ontology; documents distribution; hierarchical browsing system; hierarchical classifiers; information retrieval; large-scale documents; ontology based retrieval system; supervised learning; unsupervised learning; Buildings; Classification algorithms; Clustering algorithms; Entropy; Indexes; Ontologies; Partitioning algorithms; Browsing System; Ontology; Unsupervised Learning;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Knowledge and Systems Engineering (KSE), 2010 Second International Conference on
  • Conference_Location
    Hanoi
  • Print_ISBN
    978-1-4244-8334-1
  • Type

    conf

  • DOI
    10.1109/KSE.2010.32
  • Filename
    5632004