• DocumentCode
    397769
  • Title

    Classifying Web pages using adaptive ontology

  • Author

    Noh, Sanguk ; Seo, Aaesung ; Choi, Jaehyuk ; Choi, Kyunghee ; Jung, Gihyun

  • Author_Institution
    Sch. of Comput. Sci., Catholic Univ. of Korea, South Korea
  • Volume
    3
  • fYear
    2003
  • fDate
    5-8 Oct. 2003
  • Firstpage
    2144
  • Abstract
    In this paper, we present an automated Web page classifier based on adaptive ontology. First, to identify the representative terms for a given set of classes, we compute the product of term frequency and document frequency. Second, we rank each term by its information gain, which prioritizes terms according to how useful they are for classification. We compile the selected terms and their classifications into rules using machine learning algorithms. The compiled rules classify any Web page into categories defined on a domain ontology. In the experiments, 11 of 1,700 terms were identified as representative features for a given set of Web pages. The resulting classification accuracy was, on average, 95.2%.
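The two term-selection steps named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not define the exact term-frequency and document-frequency formulas, so the sketch assumes raw counts over the whole corpus and a presence/absence split for information gain. The function names, the `k_candidates`/`k_final` cutoffs, and the toy data are all hypothetical, and the rule-learning step is omitted.

```python
import math
from collections import Counter


def tf_df_scores(docs):
    """Score each term by corpus term frequency times document frequency.

    Assumption: raw counts are used; the paper may normalize differently.
    """
    tf = Counter()  # total occurrences of each term across all documents
    df = Counter()  # number of documents containing each term
    for tokens in docs:
        tf.update(tokens)
        df.update(set(tokens))
    return {term: tf[term] * df[term] for term in tf}


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(docs, labels, term):
    """Reduction in class entropy from splitting documents on term presence."""
    present = [y for d, y in zip(docs, labels) if term in d]
    absent = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    conditional = sum(
        len(part) / n * entropy(part) for part in (present, absent) if part
    )
    return entropy(labels) - conditional


def select_terms(docs, labels, k_candidates, k_final):
    """Keep the top TF*DF terms, then rank those candidates by information gain."""
    scores = tf_df_scores(docs)
    candidates = sorted(scores, key=lambda t: (-scores[t], t))[:k_candidates]
    gains = {t: information_gain(docs, labels, t) for t in candidates}
    return sorted(candidates, key=lambda t: (-gains[t], t))[:k_final]


# Hypothetical toy corpus: each document is a list of tokens.
docs = [
    ["ontology", "web", "page"],
    ["ontology", "class", "web"],
    ["football", "score", "web"],
    ["football", "match", "web"],
]
labels = ["cs", "cs", "sports", "sports"]

selected = select_terms(docs, labels, k_candidates=3, k_final=2)
```

In this toy corpus the term "web" has the highest TF×DF score but zero information gain, because it appears in every document of both classes. That is the kind of term the second step is meant to push down the ranking.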
  • Keywords
    Web sites; classification; information retrieval; learning (artificial intelligence); Web page classifier; adaptive ontology; document frequency; information gain; machine learning algorithms; representative features; term frequency; Communication networks; Computer science; Frequency; Information analysis; Machine learning; Machine learning algorithms; Ontologies; Protocols; Waste materials; Web pages;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Systems, Man and Cybernetics, 2003. IEEE International Conference on
  • ISSN
    1062-922X
  • Print_ISBN
    0-7803-7952-7
  • Type

    conf

  • DOI
    10.1109/ICSMC.2003.1244201
  • Filename
    1244201