DocumentCode
397769
Title
Classifying Web pages using adaptive ontology
Author
Noh, Sanguk ; Seo, Aaesung ; Choi, Jaehyuk ; Choi, Kyunghee ; Jung, Gihyun
Author_Institution
Sch. of Comput. Sci., Catholic Univ. of Korea, South Korea
Volume
3
fYear
2003
fDate
5-8 Oct. 2003
Firstpage
2144
Abstract
In this paper, we present an automated Web page classifier based on an adaptive ontology. First, to identify the representative terms for a given set of classes, we compute the product of term frequency and document frequency for each term. Second, we rank the terms by information gain, which measures how well each term discriminates among the classes. We then compile the selected terms and their classifications into rules using machine learning algorithms. The compiled rules classify any Web page into the categories defined in a domain ontology. In the experiments, 11 out of 1,700 terms were identified as representative features for a given set of Web pages. The resulting classification accuracy was 95.2% on average.
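The two term-selection steps named in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact definitions, so the following are assumptions: term frequency is the raw count of a term across a class's pages, document frequency is the number of pages containing it, and information gain is computed on term presence versus absence. The function names and the toy data are hypothetical and do not come from the paper.

```python
import math
from collections import Counter


def tf_df_scores(docs):
    """Score each term in one class's pages by TF * DF (assumed raw counts)."""
    tf = Counter()  # total occurrences of each term across the pages
    df = Counter()  # number of pages in which each term appears
    for tokens in docs:
        tf.update(tokens)
        df.update(set(tokens))
    return {term: tf[term] * df[term] for term in tf}


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(term, docs, labels):
    """Reduction in class entropy from splitting pages on presence of `term`."""
    present = [y for tokens, y in zip(docs, labels) if term in tokens]
    absent = [y for tokens, y in zip(docs, labels) if term not in tokens]
    n = len(labels)
    remainder = (len(present) / n) * entropy(present) \
        + (len(absent) / n) * entropy(absent)
    return entropy(labels) - remainder


# Hypothetical toy corpus: four tokenized pages with their class labels.
pages = [
    ["stock", "market", "stock"],
    ["stock", "price"],
    ["goal", "match"],
    ["match", "score"],
]
classes = ["finance", "finance", "sports", "sports"]

finance_scores = tf_df_scores(pages[:2])
ranked = sorted(
    {t for p in pages for t in p},
    key=lambda t: information_gain(t, pages, classes),
    reverse=True,
)
```

In this sketch, terms with a high TF×DF score within a class would be kept as candidates, and the information-gain ranking would then decide which candidates are passed on to the rule learner.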
Keywords
Web sites; classification; information retrieval; learning (artificial intelligence); Web page classifier; adaptive ontology; document frequency; information gain; machine learning algorithms; representative features; term frequency; Communication networks; Computer science; Frequency; Information analysis; Machine learning; Machine learning algorithms; Ontologies; Protocols; Waste materials; Web pages;
fLanguage
English
Publisher
IEEE
Conference_Titel
Systems, Man and Cybernetics, 2003. IEEE International Conference on
ISSN
1062-922X
Print_ISBN
0-7803-7952-7
Type
conf
DOI
10.1109/ICSMC.2003.1244201
Filename
1244201